Activity report on *[JIRA] New Feature SKER4949 - Solr SearchCommand implementation*
Scarab Link: http://sesat.no/scarab/issues/id/SKER4949
Module: Sesat> Kernel
Activity generated by Glenn-Erik Sandbakken ([EMAIL PROTECTED]) at 08/20/2008 11:56

*Reasons for the changes*

*Comments*

- By Glenn-Erik Sandbakken - 08/20/2008 11:56 ---

"Mick or anyone else: Have you had the chance to look at the result yet? Can you use this for developing the query suggestion solution? You should search for "name:word or phrase". Have a look at it and give me some feedback on what has high priority to change for this to help you create the front end solution.

The index build I started yesterday is done and I have some more cool statistics for you:

Index size: 2.1GB
XML size: 2.4GB
Index time: 4.3 hours
# of docs in index: 7.261.918

The # of docs in the index is surprisingly small, as we actually have ~16 million entries altogether in the lists. However, as I said in the jira, it was only a first feed that I made asap because I was so anxious to see the Solr result (my/our first Solr index and result ever =)

There were some formatting problems when feeding the XML I had generated from the lists, so I just removed all [^a-z0-9_ ] characters. That may have created a lot of similar entries. As I type, I am counting the # of unique list entries that were given to Solr. I will do more comprehensive formatting when creating the next XML (solr fixml =) document to index.

In addition, I have experienced character encoding issues with Solr. I have made sure all data is UTF-8 all the way (and Solr uses UTF-8), and Solr seems to match characters outside a-z0-9 (such as æ, ø, å, ³, etc.) well, but I experienced errors when using the Solr admin interface.
For me, the first URL below failed and the second didn't:

http://sch-solr-test01.dev.osl.basefarm.net:8080/solr/select/?q=øl&version=2.2&start=0&rows=10&indent=on

vs.

http://sch-solr-test01.dev.osl.basefarm.net:8080/solr/select/?q=%F8l&version=2.2&start=0&rows=10&indent=on

It seems Solr receives the data with the wrong encoding. Perhaps it's just because the input form in the Solr admin interface doesn't say which encoding to use, and Firefox then chooses ISO. If that's the reason, this shouldn't give us any trouble; just be aware of it when you test Solr from the Solr admin pages.

Stay tuned, I'll get back with more after lunch."
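
To make the difference between the two URLs concrete, here is a small Python sketch of how the query term "øl" is percent-encoded under two character sets. It assumes that "ISO" in the message means ISO-8859-1. The helper names encode_query and decode_query are made up for illustration. The sketch only shows the byte-level difference; it is not a diagnosis of what the test server or the browser actually did.

```python
from urllib.parse import quote, unquote


def encode_query(term: str, charset: str) -> str:
    """Percent-encode a query term using the given character set (illustrative helper)."""
    return quote(term, safe="", encoding=charset)


def decode_query(raw: str, charset: str) -> str:
    """Decode a percent-encoded term, interpreting the bytes in the given character set."""
    return unquote(raw, encoding=charset)


if __name__ == "__main__":
    term = "øl"

    # The same term yields different percent-encoded bytes per charset.
    for charset in ("utf-8", "iso-8859-1"):
        print(charset, "->", encode_query(term, charset))

    # If one side sends UTF-8 bytes and the other decodes them as ISO-8859-1
    # (an assumed scenario, not confirmed above), the term comes out garbled.
    print(decode_query(encode_query(term, "utf-8"), "iso-8859-1"))
```

The %F8 in the second URL is the single ISO-8859-1 byte for "ø", whereas UTF-8 encodes the same letter as the two bytes %C3%B8. A mismatch between the sender's and the receiver's charset would therefore change the term being searched for.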
_______________________________________________
Kernel-issues mailing list
[email protected]
http://sesat.no/mailman/listinfo/kernel-issues
