dataimport.properties; configure writable location?

2009-05-20 Thread Wesley Small
In Solr 1.3, is there a setting that allows one to modify where the
dataimport.properties file resides?

In a production environment, the solrconfig directory needs to be read-only.
I have observed that the DIH process works regardless, but a whopping error is
put in the logs when the dataimport.properties file obviously cannot be
created/written to.

Thanks,
Wesley



Re: dataimport.properties; configure writable location?

2009-05-20 Thread Wesley Small
Is there a place in a core's solrconfig where one can set the directory/path
that the dataimport.properties file is written to?


On 5/20/09 2:09 PM, Giovanni De Stefano giovanni.destef...@gmail.com
wrote:

 Doh,
 
 can you please rephrase?
 
 Giovanni
 



Solr - clarification on date sortable fields

2009-04-21 Thread Wesley Small
I am sending this question out on behalf of a colleague, who needs
clarification on Solr indexing of date and sortable fields.

We have declared a date field in schema.xml like below:

<field name="premierDate_dt" type="date" indexed="true" stored="true"
       multiValued="false" default="NOW"/>

While indexing, if I don't pass any value to this field, as in
<premierDate_dt/> or <premierDate_dt></premierDate_dt>, I get the
error below:

SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''
    at org.apache.solr.schema.DateField.parseMath(DateField.java:167)
    at org.apache.solr.schema.DateField.toInternal(DateField.java:138)
    at org.apache.solr.schema.FieldType.createField(FieldType.java:179)
    at org.apache.solr.schema.SchemaField.createField(SchemaField.java:93)
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:243)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:58)
 

If I remove the tag from the request instead, there are no issues.
The same behaviour exists for sortable fields such as sint and slong.  Is
there any workaround we can make in the schema file?

Or does the request need to be changed accordingly?
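
Since omitting the tag avoids the error, one option is to drop empty values on the client side before the request is built. The following is only a minimal sketch under assumptions: the document is assembled from a map of field names to string values, the class and method names (EmptyFieldFilter, toDocXml) are hypothetical, and values are assumed not to need XML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EmptyFieldFilter {

    /** Builds a <doc> element, skipping fields whose value is null or blank. */
    public static String toDocXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            String value = e.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue; // omit the tag entirely so no empty date string is sent
            }
            xml.append('<').append(e.getKey()).append('>')
               .append(value)
               .append("</").append(e.getKey()).append('>');
        }
        return xml.append("</doc>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title_t", "Pilot");
        fields.put("premierDate_dt", ""); // empty: this is what triggers "Invalid Date String"
        System.out.println(toDocXml(fields));
    }
}
```

With the tag left out, the default declared on the field in schema.xml should be what ends up in the index, although that is worth confirming against the Solr version in use.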

A quick workaround is to declare the fields as string, but the limitation
is that we cannot perform any range queries on these fields.

Interestingly, if we replace the value with all zeros in the date (i.e.
<premierDate_dt>0000-00-00T00:00:00Z</premierDate_dt>), it gets indexed,
and the value created in the index is 0002-11-30T00:00:00.
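
The 0002-11-30 value is what lenient date parsing in the JDK produces for an all-zero date: month 00 and day 00 roll backwards from year 0 instead of being rejected. The sketch below uses plain java.text.SimpleDateFormat, not Solr's DateField; that Solr 1.3 parses the value in a comparable lenient way is an assumption, and the class name ZeroDateDemo is made up for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class ZeroDateDemo {

    /** Parses and re-formats a timestamp with a lenient SimpleDateFormat in UTC. */
    public static String roundTrip(String input) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        fmt.setLenient(true); // the default: out-of-range fields roll over instead of failing
        try {
            return fmt.format(fmt.parse(input));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        // Month 00 and day 00 roll back to 30 November of the preceding year.
        System.out.println(roundTrip("0000-00-00T00:00:00")); // 0002-11-30T00:00:00
        // A valid date comes back unchanged.
        System.out.println(roundTrip("2009-04-21T00:00:00")); // 2009-04-21T00:00:00
    }
}
```

The printed year is a year-of-era without the era marker, so "0002" here means 2 BC. Calling setLenient(false) on the formatter would make the all-zero input fail instead of silently producing that date.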


Thanks.



DIH API for specifying either a specific configuration or all configurations to import

2009-04-06 Thread Wesley Small
Good Morning,

Is there any way to specify or debug a specific DIH configuration via the
API/HTTP request?

I have the following:

<lst name="defaults">
  <str name="config">dih_pc_default_feed.xml</str>
</lst>
<lst name="pc_cms_article">
  <str name="config">dih_pc_cms_article_feed.xml</str>
</lst>
<lst name="pc_local_event">
  <str name="config">dih_pc_local_event_feed.xml</str>
</lst>

For example, is there any way to specify that only pc_local_event be processed
(imported)?

Another question: if command=full-import, this should effectively mean that
all DIH configurations are executed in sequential order.  Is that correct?  I
am not seeing that behaviour at present.

Thanks,
Wesley



Re: DIH Date conversion from a source column skews time

2009-04-03 Thread Wesley Small
Okay, I will give that a try.

I could resolve this another way if I were able to execute the same XPath
retrieval twice.  Why does the following not work?

<field column="first_date_d"
       xpath="/add/doc/fie...@name='original_air_date_d']" />
<field column="second_date_s"
       xpath="/add/doc/fie...@name='original_air_date_d']" />

When I do this, only second_date_s makes it into the index.  I know the
first_date_d instruction is valid, but it just disappears.

Any thoughts?

On 4/1/09 11:59 PM, Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com
wrote:

 I guess dateFormat does the job properly but the returned value is
 changed according to timezone.
 
 Can you try this out: add an extra field which converts the date with toString()
 
 <field column="original_air_date_d_str"
        template="${entityname.original_air_date_d}"/>
 this would add an extra field as string to the index
 
 
 
 --
 --Noble Paul
 



DIH Date conversion from a source column skews time

2009-04-01 Thread Wesley Small
I have noticed that setting a dynamic date field from a source column changes
the time within the date.  Can anyone confirm this?

For example, the document I import has the following XML field:

<field name="original_air_date_d">2002-12-18T00:00:00Z</field>

In my data-import-config file I define the following instructions:

<field column="temp_original_air_date_s"
       xpath="/add/doc/fie...@name='original_air_date_d']" />

<field column="original_air_year_s"
       sourceColName="temp_original_air_date_s"
       regex="([0-9][0-9][0-9][0-9])[- /.][0-9][0-9][- /.][0-9][0-9][T][0-9][0-9][:][0-9][0-9][:][0-9][0-9][Z]"
       replaceWith="$1" />

<field column="original_air_date_d" sourceColName="temp_original_air_date_s"
       dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'"/>

What is set in my index is the following:

<arr name="temp_original_air_date_s">
  <str>2002-12-18T00:00:00Z</str>
</arr>

<arr name="original_air_year_s">
  <str>2002</str>
</arr>

<arr name="original_air_date_d">
  <date>2002-12-18T05:00:00Z</date>
</arr>

You'll notice that the hour (HH) in original_air_date_d is set to 05.  It
should still be 00.  I have noticed that it changes to either 04 or 05 in
all cases within my index.

In my schema, the dynamic field *_d is defined as:
<dynamicField name="*_d" type="date" indexed="true" stored="true"/>

Thanks,
Wesley.
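
A shift of 04 or 05 hours is what plain Java date parsing produces when the 'Z' in the pattern is quoted: it is then matched as a literal letter, and the wall-clock time is interpreted in the parser's own time zone rather than in UTC. The sketch below reproduces that with java.text.SimpleDateFormat only. The America/New_York zone is an assumption (the server's zone is not stated in the thread), and the class name QuotedZDemo is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class QuotedZDemo {

    private static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    /** Parses the text in the given zone, then prints the resulting instant in UTC. */
    public static String parseInZoneFormatInUtc(String text, String zoneId) {
        SimpleDateFormat parser = new SimpleDateFormat(PATTERN, Locale.ROOT);
        // 'Z' is quoted, so it carries no offset; the parser's zone decides the instant.
        parser.setTimeZone(TimeZone.getTimeZone(zoneId));

        SimpleDateFormat utc = new SimpleDateFormat(PATTERN, Locale.ROOT);
        utc.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = parser.parse(text);
            return utc.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Winter (EST, UTC-5): midnight local becomes 05:00 UTC.
        System.out.println(parseInZoneFormatInUtc("2002-12-18T00:00:00Z", "America/New_York"));
        // Summer (EDT, UTC-4): midnight local becomes 04:00 UTC.
        System.out.println(parseInZoneFormatInUtc("2002-07-04T00:00:00Z", "America/New_York"));
        // Parsing in UTC leaves the time untouched.
        System.out.println(parseInZoneFormatInUtc("2002-12-18T00:00:00Z", "UTC"));
    }
}
```

If this is indeed the cause, running the JVM with UTC as its default time zone (for example via -Duser.timezone=UTC) would be one way to keep the hour at 00; whether DIH offers a per-field setting for this in 1.3 is not something the thread establishes.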



Re: DIH Date conversion from a source column skews time

2009-04-01 Thread Wesley Small
Was there any follow-up to this issue I found?  Is this a legitimate bug
with the time of day changing?

I could try to solve this by executing the same xpath statement twice.

<field column="original_air_date_d"
       xpath="/add/doc/fie...@name='original_air_date_d']" />

<field column="temp_original_air_date_s"
       xpath="/add/doc/fie...@name='original_air_date_d']" />

However, when I do that, the first field, original_air_date_d, does not make
it into the index.  It seems that you cannot have two identical xpath
statements in the data import config file.  Is this by design?





Re: DIH; Hardcode field value/replacement based on source column

2009-04-01 Thread Wesley Small
Thanks for the feedback.  The TemplateTransformer is a pretty straightforward
solution.  Perfect.

Wesley.


 
 On 4/1/09 12:14 AM, Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com wrote:
 
 use TemplateTransformer
 <field column="content_type_s" template="Video" />
 
 
 
 --
 --Noble Paul
 



DIH; Hardcode field value/replacement based on source column

2009-03-31 Thread Wesley Small
I am trying to find a clean way to *hardcode* a field/column to a specific
value during the DIH process.  It does seem to be possible, but I am getting
a slightly invalid constant value in my index.

<field column="content_type_s" sourceColName="title_t" regex="(.*)"
       replaceWith="Video" />

However, the value in the index was set to VideoVideo for all documents.

Any idea why this DIH instruction would make the constant value appear twice?

Thanks,
Wesley.
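
The doubled value matches a known quirk of Java's String.replaceAll: the pattern (.*) first matches the whole input and then matches once more on the empty string left at the end, so the replacement is emitted twice. The sketch below shows this with the JDK alone. That the DIH regex/replaceWith attributes are applied with replaceAll semantics is an assumption, and the class name ReplaceAllDemo is hypothetical.

```java
public class ReplaceAllDemo {

    /** Thin wrapper so the behaviour can be exercised with different patterns. */
    public static String replaceAllWith(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String title = "Some title";
        // (.*) matches the whole string, then the empty string at the end.
        System.out.println(replaceAllWith(title, "(.*)", "Video"));   // VideoVideo
        // Anchoring both ends leaves no room for the trailing empty match.
        System.out.println(replaceAllWith(title, "^(.*)$", "Video")); // Video
        // Requiring at least one character also avoids the empty match.
        System.out.println(replaceAllWith(title, "(.+)", "Video"));   // Video
    }
}
```

For a true constant, a template-style field avoids the regex altogether, but if the regex form is kept, an anchored pattern such as ^(.*)$ should give a single value.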




Test

2009-03-27 Thread Wesley Small
Sorry, I am having trouble sending a message to this Distribution list. This
is a test.



Re: Solr 1.3; Data Import w/ Dynamic Fields

2009-03-12 Thread Wesley Small
I was successful at distributing the Solr-1.4-DEV data import functionality
within the Solr 1.3 war.

1. Copied the data import’s src directory from 1.4 to 1.3.
2. Made sure to use the data import’s build.xml already existing in Solr
   1.3.
3. Commented out all code within the SolrWriter#rollback method.
4. Commented out the following import statement from SolrWriter:
   import org.apache.solr.update.RollbackUpdateCommand;
5. Copied the required logging libraries from 1.4/lib to 1.3/lib:
   slf4j-api-1.5.5.jar
   slf4j-jdk14-1.5.5.jar

I was planning on replacing the Solr 1.4 logging scheme with the style in
Solr 1.3, but that was unnecessary work.

I am continuing my testing with this customized distribution.

Thanks again,
Wesley.



On 3/11/09 6:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 On Wed, Mar 11, 2009 at 4:01 PM, Noble Paul നോബിള്‍ नोब्ळ् 
 noble.p...@gmail.com wrote:
 
  I guess you can take the trunk and comment out the contents of
  SolrWriter#rollback() and it should work with Solr1.3
 
 
 I agree. Rollback is the only feature which depends on enhancements in
 Solr/Lucene libraries. So if you remove this feature, everything else should
 work fine with 1.3
 
 --
 Regards,
 Shalin Shekhar Mangar.
 



Solr 1.3; Data Import w/ Dynamic Fields

2009-03-11 Thread Wesley Small
Good morning,

I reviewed the SOLR-742 patch, which corrects an issue with the data import
process properly ingesting/committing (Solr add XML) documents with dynamic
fields.

Is this fix available for Solr 1.3, or is there a known workaround?

Cheers,
Wesley


Re: Solr 1.3; Data Import w/ Dynamic Fields

2009-03-11 Thread Wesley Small
Thanks for the feedback, Shalin.  I will investigate the backport of this 1.4
fix into 1.3.  Do you know of any other subsequent patches related to the
data import and dynamic fields that I should also locate and backport?
I only ask in case you happen to have this information handy.

I am reaching here, but I would like your opinion.  Do you believe it is at
all conceivable to port the entire data import functionality from the latest
1.4-dev nightly build and manually merge it with the stable 1.3 release?

On 3/11/09 5:26 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 On Wed, Mar 11, 2009 at 2:55 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:
 
  On Wed, Mar 11, 2009 at 2:28 PM, Wesley Small
  wesley.sm...@mtvstaff.com wrote:
 
  Good morning,
 
  I reviewed the SOLR-742 patch, which corrects an issue with the data import
  process properly ingesting/committing (Solr add XML) documents with dynamic
  fields.
 
  Is this fix available for Solr 1.3, or is there a known workaround?
 
 
  Unfortunately, no. The fix is in trunk but the trunk DataImportHandler uses
  a new rollback operation which is not supported by Solr 1.3 release.
 
 
 However you should be able to backport the changes in SOLR-742 to Solr 1.3
 code.
 
 --
 Regards,
 Shalin Shekhar Mangar.
 



Re: Solr 1.3; Data Import w/ Dynamic Fields

2009-03-11 Thread Wesley Small
I attempted a backport of the SOLR-742 patch on Solr 1.3.  You can see the
results below, with hunk failures.

Is there a specific method to obtain a list of patches that were applied to
the data import functionality prior to SOLR-742?  I suppose I would need to
ensure that these specific data import files (DataImporter.java,
DataConfig.java and DocBuilder.java) are at the correct revision before
applying the patch.



-sh-3.1$ pwd
/home/smallwes/projects/solr/downloads/apache-solr-1.3.0

-sh-3.1$ patch -p 0 -i ../SOLR-742.patch --dry-run
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java
Hunk #1 FAILED at 95.
Hunk #2 FAILED at 112.
Hunk #3 FAILED at 123.
Hunk #4 succeeded at 189 (offset -5 lines).
Hunk #5 FAILED at 227.
4 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java.rej
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataConfig.java
Hunk #3 FAILED at 130.
Hunk #4 FAILED at 145.
Hunk #5 FAILED at 158.
3 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataConfig.java.rej
patching file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 331.
Hunk #3 FAILED at 368.
Hunk #4 FAILED at 402.
Hunk #5 succeeded at 580 (offset 1 line).
4 out of 5 hunks FAILED -- saving rejects to file contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DocBuilder.java.rej



Regards,
Wesley.





DIH Solr1.4

2009-03-06 Thread Wesley Small
I am evaluating the DIH in Solr 1.4-DEV and am receiving a Null Pointer
Exception when the import process begins.  Here are the details:


[LOG MESSAGE]
2009-03-06 11:06:04,635 ERROR [STDERR] (http-0.0.0.0-20080-Processor3) Mar 6, 2009 11:06:04 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.lang.NullPointerException
    at java.util.Calendar.setTime(Calendar.java:1032)
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:785)
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:778)
    at java.text.DateFormat.format(DateFormat.java:314)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
    ...

[SRC: DocBuilder.java:147]
lastIndexTimeProps.setProperty(LAST_INDEX_KEY,
    DataImporter.DATE_TIME_FORMAT.get().format(dataImporter.getIndexStartTime()));

I can see that the indexing start time is indeed stored in the
dataimport.properties.

Is there an additional configuration that needs to be set for the
DataImporter.DATE_TIME_FORMAT to correctly execute?

Cheers
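
The top frames of the trace (Calendar.setTime called from SimpleDateFormat.format) are what the JDK produces when format() is handed a null Date, which would suggest that getIndexStartTime() returned null rather than that the formatter is misconfigured. That reading is an inference from the stack trace, not something confirmed in the thread. The sketch below reproduces the failure with JDK classes only; the class name, the date pattern and the formatOrDefault helper are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class NullDateFormatDemo {

    // Assumption: a per-thread formatter, mirroring the DATE_TIME_FORMAT.get() call above.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT));

    /** Returns true if formatting the given date throws NullPointerException. */
    public static boolean formatThrowsNpe(Date date) {
        try {
            FORMAT.get().format(date);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Null-safe variant: returns the fallback text when no date is available. */
    public static String formatOrDefault(Date date, String fallback) {
        return date == null ? fallback : FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(formatThrowsNpe(null));        // true
        System.out.println(formatThrowsNpe(new Date()));  // false
        System.out.println(formatOrDefault(null, "n/a")); // n/a
    }
}
```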



Re: DIH Solr1.4

2009-03-06 Thread Wesley Small
I am using:

Solr Implementation Version: 1.4-dev 750448 - smallwes - 2009-03-05 08:01:30


On 3/6/09 11:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 Which nightly build are you using? You can see this in the INFO page on solr
 admin.