Solr 1.4.1 missing CSV queryResponseWriter?

2011-03-21 Thread danomano
Hi folks, I was running 1.4.0, but today I'm trying to switch to 1.4.1 (which
appears to work with Zoie). However, when I try to retrieve data in CSV
format, wt=csv no longer appears to work.

Looking at solrconfig.xml, I tried pulling out the CSV entry, but that failed
(class not found), and indeed I cannot find CSVResponseWriter.java anywhere in
the source tree (I must be missing something). Note that I am able to upload
data via the update/csv URL.
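
For reference, the entry in question looks roughly like this (a sketch from
memory; the exact class name is a guess on my part, and it is the part that
cannot be resolved):

    <!-- hypothetical solrconfig.xml entry; fails on 1.4.1 with a class-not-found error -->
    <queryResponseWriter name="csv" class="solr.CSVResponseWriter"/>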





Re: Segments and Memory Correlate?

2011-03-21 Thread danomano
Yes, we are having physical memory issues, but does anyone know for a fact
whether there is a direct correlation between segment count and RAM usage?

I.e., when the system begins a search on a segment, does it load that
segment's full index, or does it load all index data for all segments?

I.e., if I had, for example, 1000 segments, and the system only loaded the
index for the segment it was currently searching (assuming it steps through
the segments one at a time, which is why it would take longer to query a
system with many segments), I would assume that would require significantly
less memory than loading one full index on a system with only 10 segments,
assuming the same total index size.
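
To make the question concrete, here is my current understanding as a sketch
against the Lucene 2.9/3.x-era API (the index path and the "timestamp" field
name are made up). I'd appreciate confirmation that search walks the
sub-readers one at a time, while the FieldCache arrays built for sorting cover
every document of every segment, so sort memory tracks total doc count rather
than segment count:

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.store.FSDirectory;

public class FieldCacheSizeSketch {
    public static void main(String[] args) throws Exception {
        IndexReader top = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
        long total = 0;
        // Search visits these sub-readers sequentially, one segment at a time.
        for (IndexReader segment : top.getSequentialSubReaders()) {
            // Sorting on a long field populates one 8-byte entry per document
            // in the segment, whether or not any query ever matches that doc.
            long[] values = FieldCache.DEFAULT.getLongs(segment, "timestamp");
            total += values.length;
        }
        // total equals top.maxDoc() regardless of how many segments exist,
        // which is why segment count alone should not change sort memory.
        System.out.println(total + " cached longs vs. maxDoc " + top.maxDoc());
        top.close();
    }
}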

Also, I am considering switching to Zoie, but have encountered some issues
(I will be posting on that shortly).

Dan



Segments and Memory Correlate?

2011-03-17 Thread danomano
Hi folks, I ran into a problem today where I am no longer able to execute any
queries :( due to out-of-memory errors.

I am in the process of investigating the use of different mergeFactors, or
even different merge policies altogether.
My question is: if I have many segments (i.e., smaller segments), will that
also reduce the total RAM required for searching? (My system is currently
allocated 8 GB of RAM and has a ~255 GB index.) I'm not fully up on the
default merge policy, but I believe with a mergeFactor of 10, each segment
should be approaching about 25 GB, with ~543 million documents in total.
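
(Rough arithmetic, assuming the sort in the trace below is over a single long
field: the FieldCache needs one 8-byte entry per document, so 543,000,000 docs
x 8 bytes is ~4.3 GB for that one field alone, before any other caches; that
is more than half of my 8 GB heap.)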

of note: this is all running on 1 server.

As seen below.

SEVERE: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.search.cache.LongValuesCreator.fillLongValues(LongValuesCreator.java:141)
    at org.apache.lucene.search.cache.LongValuesCreator.validate(LongValuesCreator.java:84)
    at org.apache.lucene.search.cache.LongValuesCreator.create(LongValuesCreator.java:74)
    at org.apache.lucene.search.cache.LongValuesCreator.create(LongValuesCreator.java:37)
    at org.apache.lucene.search.FieldCacheImpl$Cache.createValue(FieldCacheImpl.java:155)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:188)
    at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:337)
    at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:504)
    at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:207)
    at org.apache.lucene.search.Searcher.search(Searcher.java:101)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1389)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1285)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:344)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:273)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:210)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1324)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at com.openmarket.servletfilters.LogToCSVFilter.doFilter(LogToCSVFilter.java:89)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at com.openmarket.servletfilters.GZipAutoDeflateFilter.doFilter(GZipAutoDeflateFilter.java:66)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    ...etc



Re: Solr Hanging all of sudden with update/csv

2011-03-11 Thread danomano
Sweet, those links were very useful :) and should most definitely help!

One overriding concern I have:
1) If I were to simply update the config to use a different mergeFactor and
restart the Solr server, would it then adjust the segments accordingly, or
would I need to start from scratch (i.e., re-index all the data)?
2) Likewise, if I chose a new merge policy (such as Zoie), I take it I would
need to rebuild the entire index from scratch?

I suspect I'm going to have to bite the bullet and redo it all from scratch
again :(
Is there perhaps a tool that enables taking one index and reprocessing it by
forwarding all the data into another Solr instance?
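
For reference, the knobs in question live in the indexDefaults block of
solrconfig.xml; the values below are what I believe to be the 1.4 defaults,
quoted from memory:

    <indexDefaults>
      <!-- number of same-level segments that accumulate before a merge is triggered -->
      <mergeFactor>10</mergeFactor>
      <!-- in-memory buffer that is flushed to a new on-disk segment when full -->
      <ramBufferSizeMB>32</ramBufferSizeMB>
    </indexDefaults>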







Re: Solr Hanging all of sudden with update/csv

2011-03-09 Thread danomano
After about 4-5 hours the merge completed (it ran out of heap); as you
suggested, it was having memory issues.

Read queries during the merge were working just fine (they were taking longer
than normal, ~30-60 seconds).

I think I need to do more reading to understand the merge/optimization
process.

I am beginning to think what I need is lots of segments (i.e., frequent merges
of smaller segments; wouldn't that speed up the merging process when it
actually runs?).

A couple of things I'm trying to wrap my head around:

Increasing the number of segments will improve indexing speed on the whole.
The question I have is: when a merge actually needs to run, will having more
segments make the merge process faster, or slower? Having a 4-hour merge (and
hence a 4-hour indexing request) is not really acceptable, unless I can
control when that merge happens.
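
One idea I'm considering for controlling when the big merges happen (a sketch
only; the cap below is a number I made up, not something I've seen
recommended): capping maxMergeDocs in solrconfig.xml so the background merger
never attempts a giant merge on its own, and then folding the large segments
together with an explicit optimize during a quiet window:

    <mainIndex>
      <mergeFactor>10</mergeFactor>
      <!-- segments larger than this (in docs) are left alone by automatic merging -->
      <maxMergeDocs>10000000</maxMergeDocs>
    </mainIndex>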

We are using our Solr server differently than most: frequent inserts (in
batches), with few reads.

I would say a 'long' query time is acceptable (say, ~60 seconds).







Re: Solr Hanging all of sudden with update/csv

2011-03-08 Thread danomano
Actually, this is definitely not a RAM issue. I have VisualVM connected, and
the max RAM available to the JVM is ~7 GB, but the system is only using
~5.5 GB, with a max of 6.5 GB consumed so far.

I think... well, I'm guessing the system hit a merge threshold, but I can't
tell for sure. I have seen the index size grow rapidly today (much more than
normal; in the last 3 hours the index size has increased by about 50%).
From various posts I see that during an optimize (which I have not called), or
perhaps during the merging of segments, it is normal for the disk space
requirements to temporarily increase by 2x to 3x. As such, my only assumption
is that it must be conducting a merge.
Note: since I restarted the Solr server, I have only one client thread pushing
data in (it already transmitted its data, ~2 MB), and it has been held up for
about 4 hours now; I believe it's stuck waiting for the merge thread to
complete.
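
(If the 2x-3x figure is right, then for an index around 270 GB that would mean
a transient 540-810 GB of extra disk during the merge; it at least fits
comfortably within the ~3 TB we provisioned.)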

Is there a better way to handle merging, or at least to predict when it will
occur? (I'm essentially using the defaults: mergeFactor 10, ramBuffer 32 MB.)
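
My back-of-the-envelope on those defaults, assuming I understand the log merge
policy correctly: the ram buffer flushes a new ~32 MB segment each time it
fills, ten of those merge into one ~320 MB segment, ten of those into ~3.2 GB,
then ~32 GB, then ~320 GB. So every 10th, 100th, 1000th, ... flush pays a
progressively bigger merge, and on an index this size a top-level merge
rewriting hundreds of GB could plausibly be the multi-hour stall I'm seeing.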

I'm totally new to Solr/Lucene/indexing in general, so I'm somewhat clueless
on all this.
It should be noted that we have millions of documents, all of which are
generally < 4 KB.






Solr Hanging all of sudden with update/csv

2011-03-08 Thread danomano
Hi folks, I've been using Solr for about 3 months.

Our Solr install is a single node, and we have been injecting logging data
into the Solr server every couple of minutes, with each update taking a few
minutes.

Everything was working fine until this morning, at which point it appeared
that all updates were hung.

Restarting the Solr server did not help, as all updaters immediately 'hung'
again.

Poking around in the thread dumps and strace output, I do in fact see stuff
happening.

The index itself is about 270 GB (we are hoping to support up to 500 GB-1 TB),
and we have supplied the system with ~3 TB of disk space.

Any tips on what could be happening?
Notes: we have never run an optimize yet, and we have never deleted anything
from the system yet.


The merge thread appears to be the one that 'never returns':
"Lucene Merge Thread #0" - Thread t@41
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcher.pread0(Native Method)
    at sun.nio.ch.FileDispatcher.pread(FileDispatcher.java:31)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
    at sun.nio.ch.IOUtil.read(IOUtil.java:210)
    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:622)
    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
    at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:139)
    at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:94)
    at org.apache.lucene.store.DataOutput.copyBytes(DataOutput.java:176)
    at org.apache.lucene.index.FieldsWriter.addRawDocuments(FieldsWriter.java:209)
    at org.apache.lucene.index.SegmentMerger.copyFieldsNoDeletions(SegmentMerger.java:424)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:332)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4053)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3645)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407)


Some strace output:
23178 pread(172, "\270\316\276\2\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2"..., 4096, 98004192) = 4096 <0.09>
23178 pread(172, "\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2"..., 4096, 98004196) = 4096 <0.09>
23178 pread(172, "\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2"..., 4096, 98004200) = 4096 <0.08>
23178 pread(172, "\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2"..., 4096, 98004204) = 4096 <0.08>
23178 pread(172, "\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2"..., 4096, 98004208) = 4096 <0.08>
23178 pread(172, "\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2"..., 4096, 98004212) = 4096 <0.09>
23178 pread(172, "\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2"..., 4096, 98004216) = 4096 <0.08>
23178 pread(172, "\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2"..., 4096, 98004220) = 4096 <0.09>
23178 pread(172, "\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2"..., 4096, 98004224) = 4096 <0.13>
22688 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <0.051276>
23178 pread(172, "\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2"..., 4096, 98004228) = 4096 <0.10>
22688 futex(0x464a9f28, FUTEX_WAKE_PRIVATE, 1 
23178 pread(172, "\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2"..., 4096, 98004232) = 4096 <0.10>
22688 <... futex resumed> ) = 0 <0.51>
23178 pread(172, "\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2"..., 4096, 98004236) = 4096 <0.10>
22688 clock_gettime(CLOCK_MONOTONIC,  
23178 pread(172, "\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2"..., 4096, 98004240) = 4096 <0.10>
22688 <... clock_gettime resumed> {1900472, 454038316}) = 0 <0.54>
23178 pread(172, "\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2"..., 4096, 98004244) = 4096 <0.11>
22688 clock_gettime(CLOCK_MON

Re: does solr support posting gzipped content?

2010-11-16 Thread danomano

Sorry, yes, by 'inject' I simply mean post. I was hoping to get away without
writing any 'native' Solr code to upload gzipped files, but it sounds like
that is not possible. (The files that I'm uploading (aka posting) are CSV
formatted.)

I will poke around and see which solution, ServletFilter or
DataImportHandler, fits best. (I like the ServletFilter, as it's entirely
agnostic to the underlying indexing solution.) A sketch of what I have in mind
is below.
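
Here is the kind of ServletFilter I mean, in case it helps anyone else
searching the archives (a sketch only, untested, against the Servlet 2.5 API;
the class name is mine): it transparently decompresses any request that
arrives with Content-Encoding: gzip before Solr's update handler reads the
body.

import java.io.IOException;
import java.util.zip.GZIPInputStream;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletInputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

public class GunzipRequestFilter implements Filter {
    public void init(FilterConfig config) {}
    public void destroy() {}

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpReq = (HttpServletRequest) req;
        if ("gzip".equalsIgnoreCase(httpReq.getHeader("Content-Encoding"))) {
            // Hand Solr a wrapped request whose body is gunzipped on the fly.
            chain.doFilter(new HttpServletRequestWrapper(httpReq) {
                @Override
                public ServletInputStream getInputStream() throws IOException {
                    final GZIPInputStream gz = new GZIPInputStream(super.getInputStream());
                    return new ServletInputStream() {
                        @Override
                        public int read() throws IOException {
                            return gz.read();
                        }
                    };
                }
            }, resp);
        } else {
            chain.doFilter(req, resp);
        }
    }
}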






does solr support posting gzipped content?

2010-10-19 Thread danomano

Hi folks, I was wondering if there is any native support for posting gzipped
files to Solr?

I.e., I'm testing a project where we inject our log files into Solr for
indexing. These log files are gzipped, and I figure it would take less network
bandwidth to post the gzipped files directly. Is there a way to do this, other
than implementing my own ServletFilter or some such?

thanx