Thanks Erick, I expected to hear the dreaded word "programming" at some point and I guess that point has arrived. Now that I know where and what to tinker with.....
And I should have said 4.10 below, not 5.0.

On Sep 22, 2014, at 4:44 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> I think this'll help:
>
> http://wiki.apache.org/solr/ScriptUpdateProcessor
>
> Essentially, each time a document comes in to Solr,
> this will get invoked on it. You'll have to do some
> fiddling to get it right: you have to remove the field from
> the doc, transform it, then put it back. None of this
> is hard, but it'll require a bit of programming. Fortunately
> not too much.....
>
> Best,
> Erick
>
> On Mon, Sep 22, 2014 at 1:16 PM, Manohar Kanuri <s...@kanuri.org> wrote:
>> Hello,
>>
>> I am a non-techie who decided to download and install Solr 5.0 to parse data
>> for my community activism. Got it installed and running, updated the
>> example schema and installation with a bunch of CSV data. And went back to
>> deal with the first of two fields I deferred till later - dates and location
>> data.
>>
>> The CSV data file for Jan - August 2014 is about 650 MB with about 1.25
>> million records/rows. I split it into 5 pieces and changed MM/DD/YYYY
>> HH:MM:SS AM/PM to the YYYY-MM-DDTHH:MM:SSZ format required by Solr, using
>> TextWrangler. Which is what I know and a step up from trying to use the Mac
>> Numbers spreadsheet, which does it very easily but I will have to break it
>> into pieces smaller than 25-30 MB. Random fields can get updated months after
>> the record was created, so I have to find an easier way than breaking the CSV
>> file into smaller bits and reformatting manually. Each record/row has 4 date
>> fields, so potentially there are up to 5 million fields to be reformatted in 8
>> months' worth of data.
>>
>> I did a Google search (didn't see a Solr search page) on the mailing list
>> archives and the internet, but it seems like my question is either too simple
>> and/or it's staring me in the face and I'm just missing it: Is there a
>> simple way to reformat the dates to Solr-style in a 650 MB-1 GB CSV file? Or,
>> ideally, have the dates and times automatically reformatted as the Solr
>> index gets updated with the latest data (I recall reading this was not possible).
>> Is there a widget/gadget/gizmo/script that would do this?
>>
>> thanks,
>> manohar
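
For the first option in the question (reformatting the dates in the CSV before it is indexed), a minimal sketch in Python might look like the following. It is not the ScriptUpdateProcessor approach Erick describes, just a standalone pre-processing pass. The names `to_solr_date` and `convert_csv`, and the column names in the usage note, are hypothetical. The sketch assumes the file has a header row, that every date looks like MM/DD/YYYY HH:MM:SS AM/PM, and that the times can be labeled as UTC without conversion (it appends "Z" and does no time zone arithmetic).

```python
import csv
from datetime import datetime

# Assumed input layout, e.g. "08/31/2014 03:15:45 PM".
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
# Layout Solr expects, e.g. "2014-08-31T15:15:45Z".
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_solr_date(value):
    """Convert one date string; empty cells are passed through unchanged."""
    value = value.strip()
    if not value:
        return value
    return datetime.strptime(value, SOURCE_FORMAT).strftime(SOLR_FORMAT)


def convert_csv(src_path, dst_path, date_columns):
    """Copy src_path to dst_path, rewriting only the named date columns.

    Rows are read and written one at a time, so the whole file never has
    to fit in memory and does not need to be split into pieces first.
    """
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for col in date_columns:
                row[col] = to_solr_date(row[col])
            writer.writerow(row)
```

A call such as `convert_csv("requests.csv", "requests_solr.csv", ["created", "updated"])` (file and column names made up here) would write a new CSV with those columns rewritten and everything else untouched. A date that does not match the assumed layout raises a `ValueError` rather than being silently skipped, which makes bad rows easy to spot.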