dataimport.properties; configure writable location?
In Solr 1.3, is there a setting that allows one to modify where the dataimport.properties file resides? In a production environment, the solrconfig directory needs to be read-only. I have observed that the DIH process works regardless, but a whopping error is put in the logs when dataimport.properties obviously cannot be created or written to. Thanks, Wesley
Re: dataimport.properties; configure writable location?
Is there a place in a core's solrconfig where one can set the directory/path that the dataimport.properties file is written to?

On 5/20/09 2:09 PM, Giovanni De Stefano giovanni.destef...@gmail.com wrote:

Doh, can you please rephrase? Giovanni

On Wed, May 20, 2009 at 3:47 PM, Wesley Small wesley.sm...@mtvstaff.com wrote:

In Solr 1.3, is there a setting that allows one to modify where the dataimport.properties file resides? In a production environment, the solrconfig directory needs to be read-only. I have observed that the DIH process works regardless, but a whopping error is put in the logs when dataimport.properties obviously cannot be created or written to. Thanks, Wesley
Solr - clarification on date sortable fields
I am sending this question out on behalf of a colleague, who needs clarification on Solr indexing of date and sortable fields.

We have declared a date field in schema.xml as below:

<field name="premierDate_dt" type="date" indexed="true" stored="true" multiValued="false" default="NOW"/>

While indexing, if I don't pass any value for this field, as in <premierDate_dt/> or <premierDate_dt></premierDate_dt>, I get the error below:

SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''
at org.apache.solr.schema.DateField.parseMath(DateField.java:167)
at org.apache.solr.schema.DateField.toInternal(DateField.java:138)
at org.apache.solr.schema.FieldType.createField(FieldType.java:179)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:93)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:243)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:58)

If I instead remove the tag from the request, there are no issues. The same behavior exists for sortable fields as well, such as sint and slong.

Is there any workaround we can make in the schema file, or does the request need to be changed accordingly? A quick workaround is to declare the fields as string, but the limitation is that we cannot perform range queries on those fields.

Interestingly, if we replace the value with all zeros in the date (i.e. <premierDate_dt>0000-00-00T00:00:00Z</premierDate_dt>), it gets indexed and the value in the index is created as 0002-11-30T00:00:00.

Thanks.
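Since removing the empty tag from the request avoids the error, one option is to drop blank values on the client before the document is sent. The following is only a minimal sketch of that idea. It assumes the standard Solr XML update format (<add><doc><field name="...">), which may differ from the request format used in the message above; the helper name build_add_xml and the sample field values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Build a Solr <add> message, skipping fields whose value is None or blank.

    Skipping the field entirely lets the schema default (e.g. default="NOW")
    apply instead of sending an empty string that the date parser rejects.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        if value is None or str(value).strip() == "":
            continue  # omit the element rather than emitting an empty one
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document: premierDate_dt is blank and is therefore left out.
example = build_add_xml({"id": "42", "premierDate_dt": "", "title_t": "Pilot"})
print(example)
```

With this approach the schema stays unchanged, so the field remains a real date type and range queries keep working.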
DIH API for specifying either a specific configuration or all configurations to import
Good morning,

Is there any way to specify or debug a specific DIH configuration via the API/HTTP request? I have the following:

<lst name="defaults">
<str name="config">dih_pc_default_feed.xml</str>
</lst>
<lst name="pc_cms_article">
<str name="config">dih_pc_cms_article_feed.xml</str>
</lst>
<lst name="pc_local_event">
<str name="config">dih_pc_local_event_feed.xml</str>
</lst>

For example, is there any way to specify that only pc_local_event be processed (imported)?

Another question: if command=full-import, this should effectively mean that all DIH configurations are executed in sequential order. Is that correct? I am not seeing that behaviour at present.

Thanks, Wesley
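For reference, DIH commands are issued as plain HTTP requests against a request handler, with parameters such as command, debug and entity. The sketch below only shows how such a URL could be assembled. It assumes one possible arrangement, not confirmed anywhere in this thread, in which each configuration file is registered under its own request handler; the host, port and handler name are hypothetical.

```python
from urllib.parse import urlencode


def dih_url(base_url, handler, command, **params):
    """Assemble a DataImportHandler request URL.

    base_url and handler are deployment-specific; the values used below
    are placeholders, not names taken from a real installation.
    """
    query = {"command": command}
    query.update(params)
    return f"{base_url.rstrip('/')}/{handler.strip('/')}?{urlencode(query)}"


# Hypothetical handler registered only for the local-event configuration.
url = dih_url(
    "http://localhost:8983/solr",
    "dataimport_pc_local_event",
    "full-import",
    debug="on",
)
print(url)
```

Under that assumption, importing everything would mean calling each handler in turn rather than relying on a single full-import request.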
Re: DIH Date conversion from a source column skews time
Okay, I will give that a try. I could resolve this another way if I were able to execute the same XPath retrieval twice. Why does the following not work?

<field column="first_date_d" xpath="/add/doc/fie...@name='original_air_date_d']" />
<field column="second_date_s" xpath="/add/doc/fie...@name='original_air_date_d']" />

When I do this, only second_date_s makes it into the index. I know the first_date_d instruction is valid, but it just disappears. Any thoughts?

On 4/1/09 11:59 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote:

I guess dateFormat does the job properly, but the returned value is changed according to the timezone. Can you try this out: add an extra field which converts the date with toString().

<field column="original_air_date_d_str" template="${entityname.original_air_date_d}"/>

This would add an extra field as a string to the index.

-- --Noble Paul
DIH Date conversion from a source column skews time
I have noticed that setting a dynamic date field from a source column changes the time within the date. Can anyone confirm this? For example, the document I import has the following XML field:

<field name="original_air_date_d">2002-12-18T00:00:00Z</field>

In my data-import-config file I define the following instructions:

<field column="temp_original_air_date_s" xpath="/add/doc/fie...@name='original_air_date_d']" />
<field column="original_air_year_s" sourceColName="temp_original_air_date_s" regex="([0-9][0-9][0-9][0-9])[- /.][0-9][0-9][- /.][0-9][0-9][T][0-9][0-9][:][0-9][0-9][:][0-9][0-9][Z]" replaceWith="$1" />
<field column="original_air_date_d" sourceColName="temp_original_air_date_s" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'"/>

What is set in my index is the following:

<arr name="temp_original_air_date_s"> <str>2002-12-18T00:00:00Z</str> </arr>
<arr name="original_air_year_s"> <str>2002</str> </arr>
<arr name="original_air_date_d"> <date>2002-12-18T05:00:00Z</date> </arr>

You'll notice that the hour (HH) in original_air_date_d is set to 05. It should still be 00. I have noticed that it changes to either 04 or 05 in all cases within my index.

In my schema, the dynamic field *_d is defined as:

<dynamicField name="*_d" type="date" indexed="true" stored="true"/>

Thanks, Wesley.
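A shift of exactly 04 or 05 hours is what one would expect if the trailing Z is treated as literal text (it is quoted in the date format above) and the timestamp is then interpreted in the server's local timezone, for example US Eastern at UTC-5, or UTC-4 during daylight saving time. That is an assumption about the environment, not something confirmed in this message. The sketch below reproduces the arithmetic in Python rather than in DIH itself; the function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# The trailing "Z" is matched as a literal character, so no timezone is parsed.
PATTERN = "%Y-%m-%dT%H:%M:%SZ"


def parse_as_local_then_utc(text, local_offset_hours):
    """Parse the text as local wall-clock time, then convert it to UTC."""
    naive = datetime.strptime(text, PATTERN)
    local = naive.replace(tzinfo=timezone(timedelta(hours=local_offset_hours)))
    return local.astimezone(timezone.utc)


def parse_as_utc(text):
    """Parse the text as a UTC timestamp, which is what the Z is meant to say."""
    return datetime.strptime(text, PATTERN).replace(tzinfo=timezone.utc)


source = "2002-12-18T00:00:00Z"
skewed = parse_as_local_then_utc(source, -5)  # assumed server offset: UTC-5
correct = parse_as_utc(source)

print(skewed.strftime(PATTERN))   # 2002-12-18T05:00:00Z
print(correct.strftime(PATTERN))  # 2002-12-18T00:00:00Z
```

If this is indeed the cause, the value is being parsed in local time, and the 04 versus 05 difference would follow whether the date falls in daylight saving time.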
Re: DIH Date conversion from a source column skews time
Was there any follow-up to this issue I found? Is this a legitimate bug with the time of day changing?

I could try to solve this by executing the same xpath statement twice:

<field column="original_air_date_d" xpath="/add/doc/fie...@name='original_air_date_d']" />
<field column="temp_original_air_date_s" xpath="/add/doc/fie...@name='original_air_date_d']" />

However, when I do that, the first field, original_air_date_d, does not make it into the index. It seems that you cannot have two identical xpath statements in the data import config file. Is this by design?
Re: DIH; Hardcode field value/replacement based on source column
Thanks for the feedback. The TemplateTransformer is a pretty straightforward solution. Perfect. Wesley.

On 4/1/09 12:14 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote:

Use TemplateTransformer:

<field column="content_type_s" template="Video" />

On Tue, Mar 31, 2009 at 9:20 PM, Wesley Small wesley.sm...@mtvstaff.com wrote:

I am trying to find a clean way to *hardcode* a field/column to a specific value during the DIH process. It does seem to be possible, but I am getting a slightly wrong constant value in my index.

<field column="content_type_s" sourceColName="title_t" regex="(.*)" replaceWith="Video" />

However, the value in the index was set to VideoVideo for all documents. Any idea why this DIH instruction would make the constant value appear twice? Thanks, Wesley.

-- --Noble Paul
DIH; Hardcode field value/replacement based on source column
I am trying to find a clean way to *hardcode* a field/column to a specific value during the DIH process. It does seem to be possible, but I am getting a slightly wrong constant value in my index.

<field column="content_type_s" sourceColName="title_t" regex="(.*)" replaceWith="Video" />

However, the value in the index was set to VideoVideo for all documents. Any idea why this DIH instruction would make the constant value appear twice? Thanks, Wesley.
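A likely explanation for the doubled value, assuming the regex is applied with replace-all semantics: the pattern (.*) first matches the whole title, and then matches once more as an empty string at the end of the input, so the replacement is inserted twice. Java's String.replaceAll behaves this way, and Python 3.7 or later does the same, which makes it easy to check outside of DIH. The helper name and sample title below are made up for illustration.

```python
import re


def replace_all(value, pattern, replacement):
    """Replace every match of pattern in value, including empty matches."""
    return re.sub(pattern, replacement, value)


title = "Some episode title"

# Unanchored: one match for the full string, one empty match at the end.
doubled = replace_all(title, r"(.*)", "Video")

# Anchored: the empty match at the end no longer satisfies ^, so one match only.
single = replace_all(title, r"^(.*)$", "Video")

print(doubled)
print(single)
```

Anchoring the pattern is one way to avoid the second match, although a constant value does not really need a regex at all.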
Test
Sorry, I am having trouble sending a message to this Distribution list. This is a test.
Re: Solr 1.3; Data Import w/ Dynamic Fields
I was successful at distributing the Solr-1.4-DEV data import functionality within the Solr 1.3 war.

1. Copied the data import’s src directory from 1.4 to 1.3.
2. Made sure to use the data import’s build.xml already existing in Solr 1.3.
3. Commented out all code within the SolrWriter#rollback method.
4. Commented out the following import statement from SolrWriter: import org.apache.solr.update.RollbackUpdateCommand;
5. Copied the required logging libraries from 1.4/lib to 1.3/lib: slf4j-api-1.5.5.jar and slf4j-jdk14-1.5.5.jar.

I was planning on replacing the Solr 1.4 logging scheme with the style used in Solr 1.3, but that turned out to be unnecessary work. I am continuing my testing with this customized distribution. Thanks again, Wesley.

On 3/11/09 6:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Wed, Mar 11, 2009 at 4:01 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote:

I guess you can take the trunk and comment out the contents of SolrWriter#rollback() and it should work with Solr 1.3.

I agree. Rollback is the only feature which depends on enhancements in the Solr/Lucene libraries. So if you remove this feature, everything else should work fine with 1.3.

-- Regards, Shalin Shekhar Mangar.
Solr 1.3; Data Import w/ Dynamic Fields
Good morning, I reviewed a Solr patch, Patch-742, which corrects an issue with the data import process properly ingesting/committing (Solr add XML) documents with dynamic fields. Is this fix available for Solr 1.3, or is there a known workaround? Cheers, Wesley
Re: Solr 1.3; Data Import w/ Dynamic Fields
Thanks for the feedback, Shalin. I will investigate the backport of this 1.4 fix into 1.3. Do you know of any other subsequent patches related to the data import and dynamic fields that I should also locate and backport? I just ask in case you happen to have this information handy.

I am reaching here, but I would like your opinion. Do you believe it is at all conceivable to port the entire data import functionality from the latest 1.4-dev nightly build and manually merge it with the stable 1.3 release?

On 3/11/09 5:26 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Wed, Mar 11, 2009 at 2:55 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Wed, Mar 11, 2009 at 2:28 PM, Wesley Small wesley.sm...@mtvstaff.com wrote:

Good morning, I reviewed a Solr patch, Patch-742, which corrects an issue with the data import process properly ingesting/committing (Solr add XML) documents with dynamic fields. Is this fix available for Solr 1.3, or is there a known workaround?

Unfortunately, no. The fix is in trunk, but the trunk DataImportHandler uses a new rollback operation which is not supported by the Solr 1.3 release. However, you should be able to backport the changes in SOLR-742 to the Solr 1.3 code.

-- Regards, Shalin Shekhar Mangar.
Re: Solr 1.3; Data Import w/ Dynamic Fields
I attempted a backport of Patch-742 on Solr 1.3. You can see the results below, with hunk failures. Is there a specific method to obtain a list of the patches that were applied to the data import functionality prior to PATCH-742? I suppose I would need to ensure that these specific data import files (DataImporter.java, DataConfig.java and DocBuilder.java) are at the correct revision before applying PATCH-742.

-sh-3.1$ pwd
/home/smallwes/projects/solr/downloads/apache-solr-1.3.0
-sh-3.1$ patch -p 0 -i ../SOLR-742.patch --dry-run
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java
Hunk #1 FAILED at 95.
Hunk #2 FAILED at 112.
Hunk #3 FAILED at 123.
Hunk #4 succeeded at 189 (offset -5 lines).
Hunk #5 FAILED at 227.
4 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java.rej
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataConfig.java
Hunk #3 FAILED at 130.
Hunk #4 FAILED at 145.
Hunk #5 FAILED at 158.
3 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataConfig.java.rej
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 331.
Hunk #3 FAILED at 368.
Hunk #4 FAILED at 402.
Hunk #5 succeeded at 580 (offset 1 line).
4 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java.rej

Regards, Wesley.
DIH Solr1.4
I am evaluating the DIH in Solr 1.4-DEV and am receiving a NullPointerException when the import process begins. Here are the details:

[LOG MESSAGE]
2009-03-06 11:06:04,635 ERROR [STDERR] (http-0.0.0.0-20080-Processor3) Mar 6, 2009 11:06:04 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.lang.NullPointerException
at java.util.Calendar.setTime(Calendar.java:1032)
at java.text.SimpleDateFormat.format(SimpleDateFormat.java:785)
at java.text.SimpleDateFormat.format(SimpleDateFormat.java:778)
at java.text.DateFormat.format(DateFormat.java:314)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
...

[SRC: DocBuilder.java:147]
lastIndexTimeProps.setProperty(LAST_INDEX_KEY, DataImporter.DATE_TIME_FORMAT.get().format(dataImporter.getIndexStartTime()));

I can see that the indexing start time is indeed stored in dataimport.properties. Is there an additional configuration that needs to be set for DataImporter.DATE_TIME_FORMAT to execute correctly? Cheers
Re: DIH Solr1.4
I am using: Solr Implementation Version: 1.4-dev 750448 - smallwes - 2009-03-05 08:01:30

On 3/6/09 11:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

Which nightly build are you using? You can see this in the INFO page on solr admin.