Re: indexing txt file

2009-04-15 Thread Alejandro Gonzalez
But you need to index the text inside these files, right? You need to read the text from each file and include it in a field in the XML (of course, this field must be defined in the schema). You can do it using a script and then post the XML to Solr. What amount/rate of generated text files are
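
The script approach described above could look roughly like the following. This is only a sketch under stated assumptions: the field names `id` and `text` are hypothetical and would have to match fields defined in your schema, and the update URL is a placeholder for wherever your Solr instance listens.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(doc_id, text, id_field="id", text_field="text"):
    """Wrap one document in Solr's <add><doc>...</doc></add> update format.

    The field names are assumptions; they must exist in schema.xml.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in ((id_field, doc_id), (text_field, text)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="utf-8")


def post_to_solr(xml_bytes, update_url="http://localhost:8983/solr/update"):
    """POST the XML to Solr. The URL is a placeholder, not a known deployment."""
    request = urllib.request.Request(
        update_url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A caller would read each text file, pass its contents to `build_add_xml`, and post the result. Documents only become searchable after a commit, which can be sent the same way as a `<commit/>` message.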

Re: indexing txt file

2009-04-15 Thread Fergus McMenemie
Hi all, I'm trying to use Solr 1.3 to index a text file. I wrote a schema.xsd and an XML file. Just to make sure I understand things: do you just have one of these text files, containing many reports? Or do you have many of these text files, each containing one report? Also, is

Re: Disable logging in SOLR

2009-04-15 Thread Kraus, Ralf | pixelhouse GmbH
Bill Au wrote: Have you tried setting logging level to OFF from Solr's admin GUI: http://wiki.apache.org/solr/SolrAdminGUI Thanks for the hint! But after I restart my Tomcat, it is all reset to default? :-( Greets -Ralf-

Re: indexing txt file

2009-04-15 Thread Shalin Shekhar Mangar
On Tue, Apr 14, 2009 at 10:37 PM, Alex Vu alex.v...@gmail.com wrote: I just want to be able to index my text file, and other files that carry the same format but with different IP addresses, ports, etc. Alex, Solr consumes XML (in a specific format) and CSV. It can consume plain text through

Maven repositories

2009-04-15 Thread Gustavo Lopes
Hi, does anyone know the location of the maven snapshot repositories for solr 1.4-SNAPSHOT? Thanks -- Gustavo Lopes

Re: Maven repositories

2009-04-15 Thread Shalin Shekhar Mangar
On Wed, Apr 15, 2009 at 3:30 PM, Gustavo Lopes galo...@mediacapital.pt wrote: Hi, does anyone know the location of the maven snapshot repositories for solr 1.4-SNAPSHOT? http://people.apache.org/repo/m2-snapshot-repository/org/apache/solr/ Disclaimer - Un-released artifacts built from trunk.

Re: Disable logging in SOLR

2009-04-15 Thread Bill Au
Yes, restarting Tomcat will reset things back to default. But you should be able to configure Tomcat to disable Solr logging since Solr uses JDK logging. Bill On Wed, Apr 15, 2009 at 4:51 AM, Kraus, Ralf | pixelhouse GmbH r...@pixelhouse.de wrote: Bill Au schrieb: Have you tried setting

Re: Disable logging in SOLR

2009-04-15 Thread Mark Miller
Kraus, Ralf | pixelhouse GmbH wrote: Hi, is there a way to disable all logging output in Solr? I mean output text like: INFO: [core_de] webapp=/solr path=/update params={wt=json} status=0 QTime=3736 Greets -Ralf- You probably do not want to totally disable logging in Solr. More

Re: Disable logging in SOLR

2009-04-15 Thread Kraus, Ralf | pixelhouse GmbH
Mark Miller wrote: Kraus, Ralf | pixelhouse GmbH wrote: Hi, is there a way to disable all logging output in Solr? I mean output text like: INFO: [core_de] webapp=/solr path=/update params={wt=json} status=0 QTime=3736 Greets -Ralf- You probably do not want to totally disable

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-15 Thread Fergus McMenemie
On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote: Grant, I should note, however, that the speed difference you are seeing may not be as pronounced as it appears. If I recall, during ApacheCon I commented on how long it takes to shut down your Solr instance when exiting it. That time it

looking at the results of a distributed search using shards.

2009-04-15 Thread Fergus McMenemie
Hi, Having all kinds of fun with distributed search using shards :-) I have 30K documents indexed using DIH into an index. Another index contains documents indexed using solr-cell. I am using shards to search across both indexes. I am trying to format the results returned from solr such the
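
For readers unfamiliar with the setup being described, a distributed query is an ordinary select request with a `shards` parameter listing the indexes to search. The sketch below only builds such a URL; the host names and core paths are hypothetical and not taken from the thread.

```python
from urllib.parse import urlencode


def sharded_select_url(base_url, shards, query, rows=10):
    """Build a select URL that fans the query out to several shards.

    `shards` is a list of host:port/path entries (no scheme), which is the
    form Solr expects in the comma-separated `shards` parameter.
    """
    params = {"q": query, "rows": rows, "shards": ",".join(shards)}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical layout: one index built with DIH, one with solr-cell.
url = sharded_select_url(
    "http://localhost:8983/solr/dih",
    ["localhost:8983/solr/dih", "localhost:8983/solr/cell"],
    "report",
)
```

The node that receives the request merges the per-shard results into one response, which is why the source shard of each hit is not visible by default.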

Re: solr 1.3 + tomcat 5.5

2009-04-15 Thread Shalin Shekhar Mangar
From the log it seems like there is a solr.xml inside var/lib/tomcat5/webapps/ which Tomcat is trying to deploy and failing. Very strange. You should remove that file and see if that fixes it. On Tue, Apr 14, 2009 at 11:35 PM, andrysha nihuhoid nihuh...@gmail.com wrote: Hi, got problem setting up

Re: Distinct terms in facet field

2009-04-15 Thread Shalin Shekhar Mangar
On Wed, Apr 15, 2009 at 1:13 AM, Harsch, Timothy J. (ARC-SC)[PEROT SYSTEMS] timothy.j.har...@nasa.gov wrote: How could I get a count of distinct terms for a given query? For example: The Wiki page http://wiki.apache.org/solr/SimpleFacetParameters has a section Facet Fields with No Zeros

Re: Index Replication or Distributed Search ?

2009-04-15 Thread Shalin Shekhar Mangar
On Wed, Apr 15, 2009 at 5:07 AM, ramanathan ramanat...@youinweb-inc.com wrote: Hi, Can someone provide practical advice on how large a Solr search index can be for good performance on a consumer-facing media website? The right answer is that it depends :) It depends on the number

Using CSV for indexing ... Remote Streaming disabled

2009-04-15 Thread vivek sar
Hi, I'm trying to use CSV (Solr 1.4, 03/29) for indexing, following the wiki (http://wiki.apache.org/solr/UpdateCSV). I've updated the solrconfig.xml to have these lines, <requestDispatcher handleSelect="true"> <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="20480" /
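
With remote streaming enabled as in the config above, the CSV handler can be pointed at a file on the Solr server's own disk through the `stream.file` parameter. The following is a sketch only: the base URL and the CSV path are hypothetical placeholders.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, csv_path, commit=True):
    """Build an /update/csv URL that asks Solr to read a server-local file.

    `csv_path` must be readable by the Solr process; both arguments here
    are placeholders rather than values from the original message.
    """
    params = {
        "stream.file": csv_path,
        "stream.contentType": "text/plain;charset=utf-8",
    }
    if commit:
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/csv?" + urlencode(params)


csv_url = csv_update_url("http://localhost:8983/solr", "/tmp/records.csv")
```

If Solr answers that remote streaming is disabled, the usual suspect is that the running core is not reading the solrconfig.xml that was edited.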

Re: looking at the results of a distributed search using shards.

2009-04-15 Thread Grant Ingersoll
On Apr 15, 2009, at 11:18 AM, Fergus McMenemie wrote: Hi, Having all kinds of fun with distributed search using shards :-) I have 30K documents indexed using DIH into an index. Another index contains documents indexed using solr-cell. I am using shards to search across both indexes. I am

Commits taking too long

2009-04-15 Thread vivek sar
Hi, I have an index where I commit every 50K records (using SolrJ). Usually this commit takes 20 sec to complete, but every now and then the commit takes way too long - from 10 min to 30 min. I see more delays as the index size continues to grow - once it gets over 5G I start seeing long commit
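
The commit-every-50K pattern described above can be sketched as follows. This is not the poster's SolrJ code: `add_batch` and `commit` are hypothetical callbacks standing in for whatever client calls send documents and issue the commit.

```python
def index_in_batches(records, add_batch, commit, batch_size=50_000):
    """Send records in fixed-size batches, committing after each batch.

    `add_batch` and `commit` are caller-supplied callables (assumptions,
    not a real client API). Returns the number of commits issued.
    """
    pending = []
    commits = 0
    for record in records:
        pending.append(record)
        if len(pending) >= batch_size:
            add_batch(pending)
            commit()
            commits += 1
            pending = []  # start a fresh list so earlier batches stay intact
    if pending:  # flush the final, partially filled batch
        add_batch(pending)
        commit()
        commits += 1
    return commits
```

Timing each `commit()` call in a loop like this is one way to see whether the slow commits correlate with index size.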

Re: indexing txt file

2009-04-15 Thread Alex Vu
What amount/rate of generated text files are you thinking about? I have 1 TB worth of text files coming in every couple of minutes in real time. In about 10 minutes I will have 4 TB worth of text files. Do you just have one of these text files, containing many reports? Do you have many of

Re: Commits taking too long

2009-04-15 Thread Mark Miller
vivek sar wrote: Hi, I have an index where I commit every 50K records (using SolrJ). Usually this commit takes 20 sec to complete, but every now and then the commit takes way too long - from 10 min to 30 min. I see more delays as the index size continues to grow - once it gets over 5G I start

Re: looking at the results of a distributed search using shards.

2009-04-15 Thread Fergus McMenemie
On Apr 15, 2009, at 11:18 AM, Fergus McMenemie wrote: Hi, Having all kinds of fun with distributed search using shards :-) I have 30K documents indexed using DIH into an index. Another index contains documents indexed using solr-cell. I am using shards to search across both indexes. I am

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-15 Thread Ryan McKinley
The work being done is addressing the deletes, AIUI, but of course there are other things happening during shutdown, too. There are no deletes to do. It was a clean index to begin with and there were no duplicates. I have not followed this thread, so forgive me if this has already been

Re: looking at the results of a distributed search using shards.

2009-04-15 Thread Otis Gospodnetic
Ain't a FAQ, but could be. Look at JIRA and search for Brian, who made the same request a few months ago. I've often wondered if we could add info about the source shard, as well as whether a hit came from cache or not. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: DataImporter : Java heap space

2009-04-15 Thread Bryan Talbot
I think there is a bug in the 1.4 daily builds of data import handler which is causing the batchSize parameter to be ignored. This was probably introduced with more recent patches to resolve variables. The affected code is in JdbcDataSource.java String bsz =

Re: Question on StreamingUpdateSolrServer

2009-04-15 Thread Otis Gospodnetic
Quick comment - why so shy with number of open file descriptors? On some nothing-special machines from several years ago I had this limit set to 30K+ - here, for example: http://www.simpy.com/user/otis :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original

Re: Question on StreamingUpdateSolrServer

2009-04-15 Thread Otis Gospodnetic
One more thing. I don't think this was mentioned, but you can: - optimize your indices - use compound index format That will lower the number of open file handles. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: vivek sar

StreamingUpdateSolrServer and DIH

2009-04-15 Thread Marc Sturlese
Hey there, I have been reading about StreamingUpdateSolrServer but can't catch exactly how it works: More efficient index construction over http with solrj. If you're doing it, this is a fantastic performance improvement. Adding a StreamingUpdateSolrServer that writes update commands to an open

Re: Question on StreamingUpdateSolrServer

2009-04-15 Thread vivek sar
Thanks Otis. I did increase the number of file descriptors to 22K, but I still get this problem. I've noticed the following so far: 1) As soon as I get to around 1140 index segments (this is the total over multiple cores) I start seeing this problem. 2) When the problem starts occasionally the index

Re: StreamingUpdateSolrServer and DIH

2009-04-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Apr 16, 2009 at 3:45 AM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I have been reading about StreamingUpdateSolrServer but can't catch exactly how it works: More efficient index construction over http with solrj. If you're doing it, this is a fantastic performance

Re: DataImporter : Java heap space

2009-04-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Bryan, Thanks a lot. It is invoking the wrong method; it should have been bsz = context.getVariableResolver().replaceTokens(bsz); It was a silly mistake. --Noble On Thu, Apr 16, 2009 at 2:13 AM, Bryan Talbot btal...@aeriagames.com wrote: I think there is a bug in the 1.4 daily builds of data

want to Unsubscribe from Solr Mailing List

2009-04-15 Thread Neha Bhardwaj
Hi, I wish to unsubscribe from the list. My email address is neha_bhard...@peristent.co.in Thanks for all the help and support. Thanks and Regards, Neha Bhardwaj | Software Engineer | Persistent Systems Limited

Re: DataImporter : Java heap space

2009-04-15 Thread Mani Kumar
Aah, Bryan, you got it ... Thanks! Noble: so I can hope that it'll be fixed soon :) Thank you for fixing it ... please lemme know when it's done. Thanks! Mani Kumar 2009/4/16 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com Hi Bryan, Thanks a lot. It is invoking the wrong method; it should have

Re: want to Unsubscribe from Solr Mailing List

2009-04-15 Thread Mani Kumar
Dear Lady, this information is available on the http://lucene.apache.org/solr/mailing_lists.html page. Thank you for unsubscribing! -Mani On Thu, Apr 16, 2009 at 10:16 AM, Neha Bhardwaj neha_bhard...@persistent.co.in wrote: Hi, I wish to unsubscribe from the list. My email address is