Activity report on

  *[JIRA] New Feature SKER4949 - Solr SearchCommand implementation*

  Scarab Link: http://sesat.no/scarab/issues/id/SKER4949
  Module: Sesat> Kernel


  Activity generated by Glenn-Erik Sandbakken ([EMAIL PROTECTED]) at 08/20/2008 
11:56

  *Reasons for the changes*


  *Comments*
  - By Glenn-Erik Sandbakken - 08/20/2008 11:56 ---
  "Mick or anyone else:
Have you had the chance to look at the result yet?
Can you use this for developing the query suggestion solution?
You should search for "name:word or phrase"
Have a look at it and give me some feedback on what has high priority to change 
for this to help you create the front end solution.

The index build I started yesterday is done and I have some more cool 
statistics for you:
Index size: 2.1GB, XML size: 2.4GB
Index time: 4.3 hours.
# of Docs in index: 7.261.918
The # of docs in index is surprisingly small as we actually have ~16 mill 
entries all together in the lists.
However as I said in the jira it was only a first feed that I made asap because 
I was so anxious to see the solr result (my/our first solr index and result 
ever=)
There were some formatting problems when feeding the xml I had generated from 
the lists, so I just removed all [^a-z0-9_ ] characters.
It may have created a lot of similar entries. I am counting as I type the # of 
unique list entries that were given to solr.
I will do a more comprehensive formatting when creating the next xml (solr 
fixml=) document to index.
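
To illustrate why stripping characters can shrink the number of distinct entries, here is a minimal sketch. It applies the same [^a-z0-9_ ] character class to a handful of made-up entries; the sample data and the helper names (strip_entry, count_unique) are hypothetical and not taken from the real lists or the real feed script. It also assumes no lowercasing step runs first, so uppercase letters would be dropped as well.

```python
import re

# Same character class as mentioned above: anything outside it is dropped.
STRIP_RE = re.compile(r"[^a-z0-9_ ]")


def strip_entry(entry: str) -> str:
    """Remove every character outside a-z, 0-9, underscore and space."""
    return STRIP_RE.sub("", entry)


def count_unique(entries):
    """Return (unique entries before stripping, unique entries after)."""
    before = set(entries)
    after = {strip_entry(e) for e in entries}
    return len(before), len(after)


# Hypothetical sample entries, only for illustration.
sample = ["bål", "bøl", "bl", "c++", "c#", "c"]
print(count_unique(sample))  # (6, 2)
```

Six different inputs collapse to two after stripping, which is the kind of effect that could explain part of the gap between ~16 mill list entries and the document count in the index.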

In addition I have experienced character encoding issues with solr. I have made 
sure all data is utf8 all the way (and solr uses utf8).
Solr seems to match characters outside a-z0-9 (such as æ, ø, å, ³, etc.) well, but I 
experienced errors when using the solr admin interface.
For me, the first url below failed, and the second didn't:
http://sch-solr-test01.dev.osl.basefarm.net:8080/solr/select/?q=øl&version=2.2&start=0&rows=10&indent=on
VS
http://sch-solr-test01.dev.osl.basefarm.net:8080/solr/select/?q=%F8l&version=2.2&start=0&rows=10&indent=on
It seems solr receives the data with the wrong encoding; perhaps it's just because 
the input form in the solr admin interface doesn't say which encoding to use, and 
firefox then chooses iso.
If that's the reason, this shouldn't give us any trouble; just be aware of it 
when you test solr from the solr admin pages.
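
The difference between the two urls can be sketched as follows. The %F8 in the second url is the ISO-8859-1 byte for ø; the same character percent-encoded as utf8 is %C3%B8. Which bytes the browser actually sent for the first url, and which charset the servlet container assumed when decoding the query string, are not known here, so the mismatch shown below is an assumption about the cause, not a confirmed diagnosis.

```python
from urllib.parse import quote, unquote

query = "øl"

# Percent-encode the same query string under both charsets.
latin1_form = quote(query, encoding="latin-1")
utf8_form = quote(query, encoding="utf-8")
print(latin1_form)  # %F8l
print(utf8_form)    # %C3%B8l

# If utf8 bytes are decoded as ISO-8859-1 on the receiving side,
# the two bytes of ø turn into two unrelated characters.
garbled = unquote(utf8_form, encoding="latin-1")
print(garbled)  # Ã¸l
```

When testing by hand, percent-encoding the query explicitly, as in the second url, avoids depending on whatever charset the browser picks for the form.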

Stay tuned, I'll get back with more after lunch."
_______________________________________________
Kernel-issues mailing list
[email protected]
http://sesat.no/mailman/listinfo/kernel-issues