Thanks Erick,

I expected to hear the dreaded word "programming" at some point and I guess 
that point has arrived. Now that I know where and what to tinker with..... 

And I should have said 4.10 below, not 5.0.

On Sep 22, 2014, at 4:44 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> I think this'll help:
> 
> http://wiki.apache.org/solr/ScriptUpdateProcessor
> 
> Essentially, each time a document comes in to Solr,
> this will get invoked on it. You'll have to do some
> fiddling to get it right: you have to remove the field from
> the doc, transform it, then put it back. None of this
> is hard, but it'll require a bit of programming. Fortunately
> not too much.....
> 
> Best,
> Erick
> 
> On Mon, Sep 22, 2014 at 1:16 PM, Manohar Kanuri <s...@kanuri.org> wrote:
>> Hello,
>> 
>> I am a non-techie who decided to download and install Solr 5.0 to parse data 
>> for my community activism. Got it installed and running, updated the 
>> example schema and installation with a bunch of CSV data. And went back to 
>> deal with the first of two fields I deferred till later - dates and location 
>> data.
>> 
>> The CSV data file for Jan - August 2014 is about 650mb with about 1.25 
>> million records/rows. I split it into 5 pieces and changed MM/DD/YYYY 
>> HH:MM:SS AM/PM to the YYYY-MM-DDTHH:MM:SSZ format required by Solr, using 
>> TextWrangler. Which is what I know and a step up from trying to use Mac 
>> Numbers spreadsheet which does it very easily but I will have to break it 
>> into pieces smaller than 25-30mb. Random fields can get updated months after 
>> the record was created so I have to find an easier way than break the CSV 
>> file into smaller bits and reformat manually. Each record/row has 4 date 
>> fields so potentially there are up to 5 million fields to be reformatted in 8 
>> months' worth of data.
>> 
>> I did a Google search (didn't see a Solr search page) on the mailing list 
>> archives and the internet, but seems like my question is either too simple 
>> and/or it's staring me in the face and I'm just missing it:  Is there a 
>> simple way to reformat the dates to Solr-style in a 650mb-1gig CSV file? Or, 
>> ideally, have the dates and times automatically reformatted as the Solr 
>> index gets updated with the latest data (I recall reading this was not possible). 
>> Is there a widget/gadget/gizmo/script that would do this?
>> 
>> thanks,
>> manohar
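
For the "simple way to reformat the dates" part of the question, a small offline script is one option besides the update-processor route. What follows is only a minimal sketch in Python, not something posted in this thread. It assumes the CSV has a header row, that every date cell looks like MM/DD/YYYY HH:MM:SS AM/PM, and that the times can be written out as-is with a trailing Z (no timezone conversion is attempted). The function names, file names, and column names are hypothetical placeholders.

```python
import csv
from datetime import datetime

# Assumed input layout, e.g. "09/22/2014 04:44:00 PM"
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
# Layout Solr expects, e.g. "2014-09-22T16:44:00Z"
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_solr_date(value):
    """Reformat one date string; empty or missing cells pass through unchanged."""
    if value is None or not value.strip():
        return value
    parsed = datetime.strptime(value.strip(), SOURCE_FORMAT)
    # Assumption: the clock time is kept as-is and simply labelled with Z.
    return parsed.strftime(SOLR_FORMAT)


def convert_csv(src_path, dst_path, date_columns):
    """Stream the CSV row by row so a 650 MB file never has to fit in memory."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for column in date_columns:
                row[column] = to_solr_date(row[column])
            writer.writerow(row)
```

It could be called as, for example, convert_csv("input.csv", "output.csv", ["created", "closed"]), where both file names and both column names stand in for the real ones. Because it reads and writes one row at a time, the file would not need to be split into pieces first. A cell that does not match the assumed layout raises a ValueError rather than being silently skipped.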
