The binary format only reduces overhead; it does not compress the payload. In your case, almost all of the data is in the big text field, which is stored uncompressed, so the file size barely changes. Parsing is a lot faster for the binary format, though, which is why you see a performance boost.
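To put rough numbers on that, here is a minimal sketch (plain Java, no Solr classes) that counts the XML markup bytes around one document and compares them with the payload. It is only an estimate under stated assumptions: the field names other than id and text are hypothetical, one character is taken as one byte, and the markup is assumed to be the usual <doc>/<field name="..."> update XML.

```java
import java.nio.charset.StandardCharsets;

class OverheadEstimate {
    // Bytes of XML markup wrapped around a single field value:
    // <field name="NAME"> ... </field>
    static long xmlFieldOverhead(String fieldName) {
        String markup = "<field name=\"" + fieldName + "\">" + "</field>";
        return markup.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes of XML markup for one document: the <doc></doc> wrapper
    // plus the markup of every field.
    static long xmlDocOverhead(String[] fieldNames) {
        long total = "<doc></doc>".getBytes(StandardCharsets.UTF_8).length;
        for (String name : fieldNames) {
            total += xmlFieldOverhead(name);
        }
        return total;
    }

    // Share of the file taken up by markup rather than field values.
    static double overheadFraction(long overheadBytes, long payloadBytes) {
        return (double) overheadBytes / (overheadBytes + payloadBytes);
    }

    public static void main(String[] args) {
        // "id" and "text" appear in the thread; the other four names are hypothetical.
        String[] fields = {"id", "text", "f3", "f4", "f5", "f6"};
        long docs = 3_000_000L;          // from the thread
        long payloadPerDoc = 5000L;      // from the thread, assuming 1 byte per char

        long overheadPerDoc = xmlDocOverhead(fields);
        System.out.printf("payload total:  %d bytes%n", docs * payloadPerDoc);
        System.out.printf("markup total:   %d bytes%n", docs * overheadPerDoc);
        System.out.printf("markup share:   %.2f%%%n",
                100.0 * overheadFraction(overheadPerDoc, payloadPerDoc));
    }
}
```

With six short field names the markup comes to a few percent of each 5000-char document, so even removing it completely cannot shrink a 15GB file by much.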
2010/1/27 Tim Terlegård <tim.terleg...@gmail.com>:
> I have 6 fields. The text field is the biggest, it contains almost all
> of the 5000 chars.
>
> /Tim
>
> 2010/1/27 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>> how many fields are there in each doc? the binary format just reduces
>> overhead. it does not touch/compress the payload
>>
>> 2010/1/27 Tim Terlegård <tim.terleg...@gmail.com>:
>>> I have 3 millon documents, each having 5000 chars. The xml file is
>>> about 15GB. The binary file is also about 15GB.
>>>
>>> I was a bit surprised about this. It doesn't bother me much though. At
>>> least it performs better.
>>>
>>> /Tim
>>>
>>> 2010/1/27 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>>>> if you write only a few docs you may not observe much difference in
>>>> size. if you write large no:of docs you may observe a big difference.
>>>>
>>>> 2010/1/27 Tim Terlegård <tim.terleg...@gmail.com>:
>>>>> I got the binary format to work perfectly now. Performance is better
>>>>> than with xml. Thanks!
>>>>>
>>>>> Although, it doesn't look like a binary file is smaller in size than
>>>>> an xml file?
>>>>>
>>>>> /Tim
>>>>>
>>>>> 2010/1/27 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>>>>>> 2010/1/21 Tim Terlegård <tim.terleg...@gmail.com>:
>>>>>>> Yes, it worked! Thank you very much. But do I need to use curl or can
>>>>>>> I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
>>>>>>> use BinaryWriter then I don't know how to do this.
>>>>>> if your data is serialized using JavaBinUpdateRequestCodec, you may
>>>>>> POST it using curl.
>>>>>> If you are writing directly , use CommonsHttpSolrServer
>>>>>>>
>>>>>>> /Tim
>>>>>>>
>>>>>>> 2010/1/20 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>>>>>>>> 2010/1/20 Tim Terlegård <tim.terleg...@gmail.com>:
>>>>>>>>>>>> BinaryRequestWriter does not read from a file and post it
>>>>>>>>>>>
>>>>>>>>>>> Is there any other way or is this use case not supported? I tried
>>>>>>>>>>> this:
>>>>>>>>>>>
>>>>>>>>>>> $ curl <host>/solr/update/javabin -F stream.file=/tmp/data.bin
>>>>>>>>>>> $ curl <host>/solr/update -F stream.body=' <commit />'
>>>>>>>>>>>
>>>>>>>>>>> Solr did read the file, because solr complained when the file wasn't
>>>>>>>>>>> in the format the JavaBinUpdateRequestCodec expected. But no data is
>>>>>>>>>>> added to the index for some reason.
>>>>>>>>>
>>>>>>>>>> how did you create the file /tmp/data.bin ? what is the format?
>>>>>>>>>
>>>>>>>>> I wrote this in the first email. It's in the javabin format (I think).
>>>>>>>>> I did like this (groovy code):
>>>>>>>>>
>>>>>>>>> fieldId = new NamedList()
>>>>>>>>> fieldId.add("name", "id")
>>>>>>>>> fieldId.add("val", "9-0")
>>>>>>>>> fieldId.add("boost", null)
>>>>>>>>> fieldText = new NamedList()
>>>>>>>>> fieldText.add("name", "text")
>>>>>>>>> fieldText.add("val", "Some text")
>>>>>>>>> fieldText.add("boost", null)
>>>>>>>>> fieldNull = new NamedList()
>>>>>>>>> fieldNull.add("boost", null)
>>>>>>>>> doc = [fieldNull, fieldId, fieldText]
>>>>>>>>> docs = [doc]
>>>>>>>>> root = new NamedList()
>>>>>>>>> root.add("docs", docs)
>>>>>>>>> fos = new FileOutputStream("data.bin")
>>>>>>>>> new JavaBinCodec().marshal(root, fos)
>>>>>>>>>
>>>>>>>>> /Tim
>>>>>>>>>
>>>>>>>> JavaBin is a format.
>>>>>>>> use this method JavaBinUpdateRequestCodec#marshal(UpdateRequest
>>>>>>>> updateRequest, OutputStream os)
>>>>>>>>
>>>>>>>> The output of this can be posted to solr and it should work

--
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com