Indexing MultiCore

2010-03-05 Thread Suram
Hi, how can I index the XML file with the multicore admin? While trying to execute the following command I am getting an error like this: \solr\example\exampledocs> java -Ddata=args -Durl=http://localhost:8080/solr/core0/update -jar post.jar Example.xml Mar 5, 2010 3:37:00 PM
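
A minimal SolrJ sketch of the same kind of core-specific update, assuming a 1.4-era SolrJ client, a core named core0 on port 8080, and hypothetical id/name fields that exist in that core's schema.xml:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class PostToCore {
        public static void main(String[] args) throws Exception {
            // Point the client at the specific core, not the top-level /solr URL.
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");          // field names must match core0's schema.xml
            doc.addField("name", "Example");

            server.add(doc);
            server.commit();                  // make the document visible to searches
        }
    }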

Re: Clustering from analyzed text instead of raw input

2010-03-05 Thread Stanislaw Osinski
I'll give the stopwords treatment a try, but the problem is that we perform POS tagging and then use payloads to keep only nouns and adjectives, and we thought it could be interesting to perform clustering only with these elements, to avoid senseless words. POS tagging could help a lot

Re: Warning : no lockType configured for...

2010-03-05 Thread Mani EZZAT
Should I file a bug? Mani EZZAT wrote: I tried using the default solrconfig and schema (from the example in the 1.3 release) and I still get the same warnings. When I look at the log, the solrconfig seems correctly loaded, but something is strange: newSearcher warming query from

Store input text after analyzers and token filters

2010-03-05 Thread JCodina
In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or an interesting transformation of the text, it could be useful to store the text after the tokenizer/filter chain. Is there a way to do this? To be able to get back the text of the

Re: Clustering Search taking 4sec for 100 results

2010-03-05 Thread Stanislaw Osinski
Hi, It might also be interesting to add some logging of clustering time (just filed: https://issues.apache.org/jira/browse/SOLR-1809) to see what the index search vs. clustering proportions are. Cheers, S. On Fri, Mar 5, 2010 at 03:26, Erick Erickson erickerick...@gmail.com wrote: Search time

Re: Store input text after analyzers and token filters

2010-03-05 Thread Ahmet Arslan
In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or an interesting transformation of the text, it could be useful to store the text after the tokenizer/filter chain. Is there a way to do this? To be able to get back the text

Re: Can I use .XML files instead of .OSM files

2010-03-05 Thread mamathahl
The body field is of string type. When I tried giving it text, it gave an error. There is nothing called Textparser; it's a stringparser. The body content of a few records is really huge. I am not sure whether string can handle such a huge amount of data. When ant index is done, it says

Re: get english spell dictionary

2010-03-05 Thread michaelnazaruk
Hi all! Please tell me where I can get a spell dictionary for Solr. -- View this message in context: http://old.nabble.com/english-%28american%29-spell-dictionary-tp27778741p27793939.html Sent from the Solr - User mailing list archive at Nabble.com.

example solr xml working fine but my own xml files not working

2010-03-05 Thread venkatesh uruti
I am trying to import an XML file into Solr; it imports successfully, but it does not show any results when searching in Solr. In the solr home/example docs/ directory all the example XMLs work fine, but when I create a new XML file and try to upload it to Solr it doesn't work. Can anyone please

Re: Can I use .XML files instead of .OSM files

2010-03-05 Thread Erick Erickson
I think you need to back up a step or three here. If I'm reading your messages right, you've essentially taken an arbitrary file, renamed it and tried to index it. This won't work unless you make your schema match, and the xml file has the proper tags. SOLR doesn't magically index arbitrary XML.

Re: example solr xml working fine but my own xml files not working

2010-03-05 Thread Erick Erickson
Does your new xml follow the same structure as the example? That is, <add> <doc> <fielda>...</fielda> <fieldb>...</fieldb> </doc> </add>? Have you tried looking at the results with the admin page to see what's actually in your index? More data please. What did you do to try to index your new data? What response did
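
To make the envelope Erick describes concrete, here is a rough sketch that posts that structure as raw XML through SolrJ's DirectXmlRequest, assuming a 1.4-era client; the Solr URL and the id/name fields are placeholders and must match your own schema.xml:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.DirectXmlRequest;

    public class PostRawXml {
        public static void main(String[] args) throws Exception {
            // The update handler only understands this <add><doc><field> envelope;
            // "id" and "name" are placeholder field names.
            String xml = "<add>"
                       +   "<doc>"
                       +     "<field name=\"id\">1</field>"
                       +     "<field name=\"name\">my first document</field>"
                       +   "</doc>"
                       + "</add>";

            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            new DirectXmlRequest("/update", xml).process(server);
            server.commit();
        }
    }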

Re: highlight multi-valued field returns weird cut-off highlighted terms

2010-03-05 Thread Koji Sekiguchi
uwdanny wrote: in this error case, the original query is q=pizza <field name="TEST_KEYWORDS" type="text_keep_stopwords" multiValued="true" indexed="true" stored="true" termVectors="false" omitNorms="true"/> <fieldType name="text_keep_stopwords" class="solr.TextField" positionIncrementGap="100"

Stemming

2010-03-05 Thread Suram
Hi, How can I set up stemming for Italian? Can anyone tell me? -- View this message in context: http://old.nabble.com/Stemming-tp27794521p27794521.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Documents disappearing

2010-03-05 Thread Pascal Dimassimo
Hi, hossman wrote: : We index using 4 processes that read from a queue of documents. Each process : sends one document at a time to the /update handler. Hmmm.. then you should have a message from the LogUpdateProcessorFactory for every individual add command that was received ... did

Re: Store input text after analyzers and token filters

2010-03-05 Thread JCodina
Thanks, it can be useful as a workaround, but I get a vector, not a result that I can use wherever I could use the stored text. I'm thinking of clustering. Ahmet Arslan wrote: In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or

Re: Stemming

2010-03-05 Thread Grant Ingersoll
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SnowballPorterFilterFactory On Mar 5, 2010, at 9:24 AM, Suram wrote: Hi, How can I set up stemming for Italian? Can anyone tell me? -- View this message in context:
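
In Solr itself this is configured per field type in schema.xml via solr.SnowballPorterFilterFactory with language="Italian" (the page linked above). As a quick offline sanity check of what that Snowball stemmer does to Italian words, here is a sketch assuming the Lucene 2.9 snowball contrib jar that ships with Solr 1.4 is on the classpath:

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.TermAttribute;
    import org.apache.lucene.util.Version;

    public class ItalianStemDemo {
        public static void main(String[] args) throws Exception {
            // Same Italian Snowball stemmer that SnowballPorterFilterFactory wraps.
            SnowballAnalyzer analyzer = new SnowballAnalyzer(Version.LUCENE_29, "Italian");
            TokenStream ts = analyzer.tokenStream("body", new StringReader("gatti gatto bellissimi"));
            TermAttribute term = ts.addAttribute(TermAttribute.class);
            while (ts.incrementToken()) {
                System.out.println(term.term());   // stemmed tokens, e.g. "gatt"
            }
        }
    }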

Re: SolrJ commit options

2010-03-05 Thread Jerome L Quinn
Shalin Shekhar Mangar shalinman...@gmail.com wrote on 02/25/2010 07:38:39 AM: On Thu, Feb 25, 2010 at 5:34 PM, gunjan_versata gunjanga...@gmail.com wrote: We are using SolrJ to handle commits to our Solr server. All runs fine, but whenever the commit happens, the server becomes slow and

SolrConfig - constructing the object

2010-03-05 Thread Kimberly Kantola
Hi All, I am new to using the Solr classes in development. I am trying to determine how to create a SolrConfig object. Is it just a matter of calling new SolrConfig with the location of the solrconfig.xml file? SolrConfig config = new SolrConfig("/path/to/solrconfig.xml"); Thanks for

Re: SolrConfig - constructing the object

2010-03-05 Thread Mark Miller
On 03/05/2010 10:29 AM, Kimberly Kantola wrote: Hi All, I am new to using the Solr classes in development. I am trying to determine how to create a SolrConfig object. Is it just a matter of calling new SolrConfig with the location of the solrconfig.xml file? SolrConfig config = new
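
Mark's reply is cut off above; for what it's worth, one way to get a fully loaded SolrConfig without parsing it by hand is to let a CoreContainer build it. A rough sketch, assuming Solr 1.4's embedded API, a single default core, and that -Dsolr.solr.home points at a valid Solr home:

    import org.apache.solr.core.CoreContainer;
    import org.apache.solr.core.SolrConfig;
    import org.apache.solr.core.SolrCore;

    public class LoadConfig {
        public static void main(String[] args) throws Exception {
            // CoreContainer reads solr.xml / solrconfig.xml from -Dsolr.solr.home
            CoreContainer.Initializer init = new CoreContainer.Initializer();
            CoreContainer container = init.initialize();

            SolrCore core = container.getCore("");   // "" = the default single core
            try {
                SolrConfig config = core.getSolrConfig();
                System.out.println("SolrConfig loaded for core: " + core.getName());
            } finally {
                core.close();                         // getCore() increments a reference count
                container.shutdown();
            }
        }
    }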

Re: Can I use .XML files instead of .OSM files

2010-03-05 Thread mamathahl
Thanks for your valuable suggestion. My XML file does not contain <add> <doc> tags at all. It's just of this format: <row id="1" lat="43.7895" lng="-73.1289" body="...Some text"/> As I understand it (if my understanding is right), <add><doc> posts each record into a Solr document. This could be done, if I

how to boost first token

2010-03-05 Thread Сергей Кашин
I have some documents in the Solr index like this: <doc> <arr name="brand"><str>toyota</str></arr> <arr name="name"><str>shock front</str></arr> </doc> <doc> <arr name="brand"><str>toyota</str></arr> <arr name="name"

Re: highlight multi-valued field returns weird cut-off highlighted terms

2010-03-05 Thread uwdanny
Thanks a lot Koji; I'll do some deep diving on my tokenizer modification. Appreciate the pointers! Koji Sekiguchi-2 wrote: uwdanny wrote: in this error case, the original query is q=pizza <field name="TEST_KEYWORDS" type="text_keep_stopwords" multiValued="true" indexed="true" stored="true"

indexing a huge data set

2010-03-05 Thread Mark N
What would be the fastest way to index documents? I am indexing a huge collection of data after extracting certain metadata, for example the author and filename of each file. I am extracting this information and storing it in XML format, for example: <fileid>1</fileid><author>abc</author>

Re: indexing a huge data set

2010-03-05 Thread Joe Calderon
I've found the CSV update to be exceptionally fast, though others enjoy the flexibility of the data import handler. On Fri, Mar 5, 2010 at 10:21 AM, Mark N nipen.m...@gmail.com wrote: what would be the fastest way to index documents, I am indexing a huge collection of data after extracting
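
For reference, the CSV handler can also be driven from a SolrJ client by streaming the file to /update/csv; a rough sketch assuming a 1.4-era SolrJ API and a hypothetical data.csv whose header row names the target schema fields:

    import java.io.File;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class CsvLoad {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // Stream the whole file to the CSV handler in one request;
            // the first line of data.csv is expected to name the target fields.
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/csv");
            req.addFile(new File("data.csv"));
            server.request(req);

            server.commit();
        }
    }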

Comma delimited Search Strings for Location Data

2010-03-05 Thread Kevin Penny
Hello - article http://www.gissearch.com/location_extraction_solr Background: (on solr 1.3) We're doing a similar thing with our location data - however we're finding that if the 'input' string is one long string i.e.: q= *Philadelphia,PA,19103,US* that we're getting *0* matches - Instead

Re: Comma delimited Search Strings for Location Data

2010-03-05 Thread Kevin Penny
With the 2 searches, here's the debug output: q=Pittsburgh,PA,15222,US parsedquery_toString *text:pittsburgh pa 15222 us* (returns 0 matches) q=Pittsburgh, PA, 15222, US parsedquery_toString *text:pittsburgh text:pa text:15222 text:us* (returns x matches) So the first query is searching on

Searching, indexing, not matching.

2010-03-05 Thread John Ament
Hey, so I just downloaded and am trying Solr 1.4, wonderful tool. One thing I noticed: I created a data config that looks something like this: <dataConfig> <dataSource type="JdbcDataSource" driver="oracle.jdbc.pool.OracleDataSource" url="jdbc:oracle:thin:@..." user="..." password="..."/> <document> <entity

Re: indexing a huge data set

2010-03-05 Thread Otis Gospodnetic
That is indeed the fastest way in. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Joe Calderon calderon@gmail.com To: solr-user@lucene.apache.org Sent: Fri, March 5, 2010 2:36:29

Re: Stemming

2010-03-05 Thread Otis Gospodnetic
Suram, You have to use an Italian-specific analyzer: http://www.search-lucene.com/?q=italian Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message From: Suram reactive...@yahoo.com To:

Re: get english spell dictionary

2010-03-05 Thread Otis Gospodnetic
Hi, As in a list of (common) English words? My Ubuntu has /usr/share/dict/american-english and british-english with about 100K words each. See also: http://www.search-lucene.com/?q=%2Benglish+%2Bdictionary Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem

Re: Store input text after analyzers and token filters

2010-03-05 Thread Otis Gospodnetic
Hi Joan, You could use the FieldAnalysisRequestHandler: http://www.search-lucene.com/?q=FieldAnalysisRequestHandler Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message From: JCodina
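
In SolrJ terms that handler can be reached through the FieldAnalysisRequest class (an assumption on my part that the 1.4 client bundles it alongside the handler, mapped at its default /analysis/field path); a rough sketch with a hypothetical field named body:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse;

    public class AnalyzeText {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // Ask Solr to run the "body" field's analyzer chain over the given text
            // and return the resulting tokens instead of indexing anything.
            FieldAnalysisRequest req = new FieldAnalysisRequest();
            req.addFieldName("body");
            req.setFieldValue("Some raw text to push through the tokenizer/filter chain");

            FieldAnalysisResponse resp = req.process(server);
            // The response lists the token stream after each stage of the analyzer;
            // the last stage is the cleaned-up text the original poster is after.
            System.out.println(resp);
        }
    }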

Re: Searching, indexing, not matching.

2010-03-05 Thread Otis Gospodnetic
John, Maybe your default search field is set to some field that doesn't have dell in it. The default search field is specified in schema.xml. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message

Re: Comma delimited Search Strings for Location Data

2010-03-05 Thread Otis Gospodnetic
If you want to treat commas as spaces, one quick and dirty way of doing that is this: s/,/, /g Do that to the query string before you send it to Solr and you are done. :) Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search ::
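
As a sketch of that quick-and-dirty preprocessing on the client side (plain Java string handling, nothing Solr-specific; the sample query string is taken from the thread above):

    public class CommaFix {
        public static void main(String[] args) {
            // Add a space after every comma so the query parser sees separate terms.
            String raw = "Philadelphia,PA,19103,US";
            String q = raw.replace(",", ", ");
            System.out.println(q);   // Philadelphia, PA, 19103, US
        }
    }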

Re: SolrJ commit options

2010-03-05 Thread Otis Gospodnetic
Jerry, This is why people often do index modifications on one server (master) and replicate the read-only index to 1+ different servers (slaves). If you do that, does the problem go away? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search ::

Re: facet on null value

2010-03-05 Thread Lance Norskog
(I don't know where filter queries came in.) If you get a result with <lst name="facet_counts"> <lst name="facet_queries"/> <lst name="facet_fields"> <lst name="features"> <int name="0">40</int> <int name="000">60</int> <int name="1">20</int> <int>2</int> </lst> </lst> </lst> and you want to get facets of '000' and Null,
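
One way to get both the '000' bucket and the missing-value bucket in a single request is to combine facet.missing with facet queries; a sketch using SolrJ against the features field from the output above, assuming a default local Solr URL:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class FacetNulls {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrQuery query = new SolrQuery("*:*");
            query.setFacet(true);
            query.addFacetField("features");
            query.set("facet.missing", true);            // extra unnamed bucket = docs with no value
            query.addFacetQuery("features:000");         // explicit count for the '000' value
            query.addFacetQuery("-features:[* TO *]");   // count of docs missing the field entirely

            QueryResponse rsp = server.query(query);
            System.out.println(rsp.getFacetQuery());     // facet.query counts
            System.out.println(rsp.getFacetFields());    // per-value counts, incl. the missing bucket
        }
    }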

Re: SolrJ commit options

2010-03-05 Thread Lance Norskog
One technique to control commit times is to do automatic commits: you can configure a core to commit every N seconds (really milliseconds, but less than 5 minutes becomes difficult) and/or every N documents. This promotes a more fixed amount of work per commit. Also, the maxMergeDocs parameter

Re: SolrJ commit options

2010-03-05 Thread gunjan_versata
But can anyone explain to me the use of these parameters? I have read up on them; what I could not understand was: if I set both params to false, after how much time will my changes start being reflected? -- View this message in context:
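
For reference, the two parameters in question are the waitFlush/waitSearcher flags on commit; a small SolrJ sketch, assuming a 1.4-era client:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class CommitFlags {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // commit(waitFlush, waitSearcher): with both false the client call does not
            // block until the index is flushed or the new (warmed) searcher is registered.
            // The changes become visible only once that new searcher is in place, so the
            // delay depends on warm-up time rather than on a fixed interval.
            server.commit(false, false);
        }
    }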

Re: example solr xml working fine but my own xml files not working

2010-03-05 Thread Suram
venkatesh uruti wrote: I am trying to import an XML file into Solr; it imports successfully, but it does not show any results when searching in Solr. In the solr home/example docs/ directory all the example XMLs work fine, but when I create a new XML file and try to upload it to Solr

multiCore

2010-03-05 Thread Suram
Hi, how can I send the XML file to Solr after creating the multicore setup? I tried, but it refuses to accept it. -- View this message in context: http://old.nabble.com/multiCore-tp27802043p27802043.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: multiCore

2010-03-05 Thread Siddhant Goel
Can you provide the error message that you got? On Sat, Mar 6, 2010 at 11:13 AM, Suram reactive...@yahoo.com wrote: Hi, how can I send the XML file to Solr after creating the multicore setup? I tried, but it refuses to accept it. -- View this message in context:

Re: multiCore

2010-03-05 Thread Suram
Siddhant Goel wrote: Can you provide the error message that you got? On Sat, Mar 6, 2010 at 11:13 AM, Suram reactive...@yahoo.com wrote: Hi, how can I send the XML file to Solr after creating the multicore setup? I tried, but it refuses to accept it. -- View this message in context:

Re: Faceted search in 2 indexes

2010-03-05 Thread Kranti™ K K Parisa
Hi, I am also looking for a solution for this. Case: Index1 has the metadata and the contents of the files (basically read-only for the end users). Index2 will have the tags attached to the search results that the user may get out of Index1 (so read/write), so next time when the user searches it