Re: how to index 20 MB plain-text xml

2014-03-31 Thread Floyd Wu
Hi Alex, Thanks for your response. Personally, I don't want to feed these big XML files to Solr, but the users want it. I'll try your suggestions later. Many thanks. Floyd 2014-03-31 13:44 GMT+08:00 Alexandre Rafalovitch arafa...@gmail.com: Without digging too deep into why exactly this is happening,

Re: how to index 20 MB plain-text xml

2014-03-31 Thread primoz . skale
...@gmail.com To: solr-user@lucene.apache.org Date: 31.03.2014 08:18 Subject: Re: how to index 20 MB plain-text xml Hi Alex, Thanks for your response. Personally, I don't want to feed these big XML files to Solr, but the users want it. I'll try your suggestions later. Many thanks. Floyd 2014-03

Re: how to index 20 MB plain-text xml

2014-03-31 Thread Upayavira
Tell the user they can't have it! Or, write a small app that reads in their XML in one go and pushes it to Solr in parts. Generally, I'd say letting a user hit Solr directly is a bad thing, especially a user who doesn't know the details of how Solr works. Upayavira On Mon, Mar 31, 2014, at 07:17
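A minimal sketch of the "small app" idea, assuming the input is already in Solr's XML update format (an `<add>` root holding `<doc>` children). It parses the file incrementally rather than all at once, which is a deliberate variation on "in one go" to keep memory flat. The function names, the batch size, and the update URL passed to `post_batch` are illustrative assumptions, not anything taken from the thread.

```python
import urllib.request
import xml.etree.ElementTree as ET


def iter_solr_batches(source, batch_size=100):
    """Yield <add>...</add> byte strings with at most batch_size top-level <doc>s."""
    batch = []
    # iterparse streams the input, so the full tree is never held in memory.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <add> root
    depth = 1                # we are now inside the root element
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        # Only top-level <doc> elements (direct children of the root) count.
        if elem.tag == "doc" and depth == 1:
            batch.append(ET.tostring(elem))
            root.clear()     # release documents that were already serialised
            if len(batch) == batch_size:
                yield b"<add>" + b"".join(batch) + b"</add>"
                batch = []
    if batch:
        yield b"<add>" + b"".join(batch) + b"</add>"


def post_batch(body, url):
    """POST one batch to a Solr update URL (the URL is supplied by the caller)."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml; charset=utf-8"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Usage would look something like `for body in iter_solr_batches("big.xml"): post_batch(body, "http://localhost:8983/solr/collection1/update")`, where both the file name and the core name are placeholders.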

Re: how to index 20 MB plain-text xml

2014-03-31 Thread Floyd Wu
Hi Upayavira, Users don't hit Solr directly; they search documents through my application. The application is the entry point where users upload documents, which are then indexed by Solr. The situation is that they upload a plain-text file, something like a dictionary. You know, a dictionary is something big. I'm trying

Re: how to index 20 MB plain-text xml

2014-03-31 Thread Alexandre Rafalovitch
If you have an application, why are you sending XML documents to Solr? Can't you convert them to some other format and then send them in batches? Or even if it is XML, just bite the bullet and send it in 100-document batches. Or in smaller batches, and use the auto-commit settings I mentioned earlier. Regards,
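As one possible reading of "convert to another format and send in batches", here is a sketch that groups already-parsed documents into JSON payloads of 100. It assumes each document is a plain dict of field names to values, and that the target Solr version accepts a JSON array of documents on its update handler when posted with `Content-Type: application/json`. The helper names are hypothetical.

```python
import json
from itertools import islice


def chunked(docs, size=100):
    """Yield successive lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def to_json_payloads(docs, size=100):
    """Yield UTF-8 encoded JSON arrays, one per batch of documents."""
    for chunk in chunked(docs, size):
        yield json.dumps(chunk).encode("utf-8")
```

Each payload can then be posted separately. No explicit commit is issued here, on the assumption that the auto-commit settings mentioned above are left to handle it.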

how to index 20 MB plain-text xml

2014-03-30 Thread Floyd Wu
I have many plain-text XML files that I convert to the Solr XML format. But every time I send them to Solr, I hit an OOM exception. How do I configure Solr to accept these big XML files? Please point me in the right direction. Thanks, Floyd

Re: how to index 20 MB plain-text xml

2014-03-30 Thread Alexandre Rafalovitch
Without digging too deep into why exactly this is happening, here are the general options: 0. Are you actually committing? Check the messages in the logs and see if the records show up when you expect them to. 1. Are you actually trying to feed a 20 MB file to Solr? Maybe it's the HTTP buffer that's