OK, I think I see it. I am thinking of exposing writes to the
properties file via an API,

say Context#persist(key, value).


This would write the data to dataimport.properties.

You would then be able to retrieve that value with ${dataimport.persist.<key>}

or through an API, Context#getPersistValue(key).
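
A minimal sketch of the idea, not existing DataImportHandler code. The
class name PersistSketch, the "dataimport.persist." key prefix inside the
file, and the temp-file location are all assumptions made for illustration.
Only the persist/getPersistValue names come from the proposal above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical stand-in for the proposed Context#persist and
// Context#getPersistValue. Nothing here is real DIH API.
class PersistSketch {
    // Assumed prefix, mirroring the ${dataimport.persist.<key>} variable form.
    static final String PREFIX = "dataimport.persist.";

    private final Path file;

    PersistSketch(Path file) {
        this.file = file;
    }

    // Read the current file so that other entries survive a rewrite.
    private Properties load() {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    // Store one key/value pair and rewrite the properties file.
    void persist(String key, String value) {
        Properties props = load();
        props.setProperty(PREFIX + key, value);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Return the stored value, or null if the key was never persisted.
    String getPersistValue(String key) {
        return load().getProperty(PREFIX + key);
    }

    public static void main(String[] args) {
        // Writes to the temp directory so no real dataimport.properties is touched.
        Path file = Paths.get(System.getProperty("java.io.tmpdir"),
                "dataimport-sketch.properties");
        PersistSketch ctx = new PersistSketch(file);
        ctx.persist("last_indexed_id", "1234");
        System.out.println(ctx.getPersistValue("last_indexed_id"));
    }
}
```

With something like this, a Transformer could call persist("last_indexed_id", ...)
as rows go by, and the next run could read the value back to build its query.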

You can raise an issue and attach a patch, and we can get it committed.

I guess this is what you wish to achieve.

--Noble



On Wed, Dec 3, 2008 at 3:28 AM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>
> Do you mean the file used by DataImportHandler called dataimport.properties?
> If you mean this one, it is written at the end of the indexing process. The
> written date will be used in the next indexing run by the delta-query to identify
> the new or modified rows from the database.
>
> What I am trying to do is save the last indexed id instead of a timestamp.
> That way, the next execution will start indexing from the last doc that was
> indexed in the previous run. But I am still a bit
> confused about how to do that...
>
> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>
>> delta-import file?
>>
>>
>> On Wed, Dec 3, 2008 at 12:08 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
>>> Does the DIH delta feature rewrite the delta-import file for each set of
>>> rows? If it does not, that sounds like a bug/enhancement.
>>> Lance
>>>
>>> -----Original Message-----
>>> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED]
>>> Sent: Tuesday, December 02, 2008 8:51 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: DataImportHandler: Deleteing from index and db; lastIndexed
>>> id feature
>>>
>>> You can write the details to a file using a Transformer itself.
>>>
>>> It is wise to stick to the public API as far as possible. We will
>>> maintain back compat and your code will be usable w/ newer versions.
>>>
>>>
>>> On Tue, Dec 2, 2008 at 5:12 PM, Marc Sturlese <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>> Thanks, I really appreciate your help.
>>>>
>>>> I didn't explain myself very well here:
>>>>
>>>>> 2.- This is probably my most difficult goal.
>>>>> Delta-import reads a timestamp from dataimport.properties and
>>>>> modifies/adds all documents from the db which were inserted after that date.
>>>>> What I want is to be able to save in the field the id of the last
>>>>> indexed doc, so that the next time I execute the indexer it starts
>>>>> indexing from that last indexed doc id.
>>>> You can use a Transformer to write something to the DB.
>>>> Context#getDataSource(String) for each row
>>>>
>>>> When I said:
>>>>
>>>>> be able to save in the field the id of the last indexed doc
>>>> I made a mistake; I meant:
>>>>
>>>> be able to save in the file (dataimport.properties) the id of the last
>>>> indexed doc.
>>>> The point would be to run my own delta query, indexing from the last
>>>> indexed doc id instead of the timestamp.
>>>> So I think this would not work in that case (my mistake, because
>>>> of the bad explanation):
>>>>
>>>>>You can use a Transformer to write something to the DB.
>>>>>Context#getDataSource(String) for each row
>>>>
>>>> That is why I was saying:
>>>>> I think I should begin modifying the SolrWriter.java and
>>>>> DocBuilder.java.
>>>>> Creating functions like getStartTime, persistStartTime... for ID
>>>>> control
>>>>
>>>> Am I heading in the right direction?
>>>> Sorry for my English, and thanks in advance.
>>>>
>>>>
>>>> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>>>>
>>>>> On Tue, Dec 2, 2008 at 3:01 PM, Marc Sturlese
>>>>> <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>>
>>>>>> Hey there,
>>>>>>
>>>>>> I have my DataImportHandler almost completely configured. I am
>>>>>> missing three goals. I don't think I can reach them just via XML
>>>>>> config or a Transformer and SqlEntityProcessor plugin, but I need to be
>>>>>> sure of that.
>>>>>> If there's no other way I will hack some Solr source classes, and would
>>>>>> like to know the best way to do that. Once I have it solved, I can
>>>>>> upload or post the source in the forum in case someone thinks it can
>>>>>> be helpful.
>>>>>>
>>>>>> 1.- Every time I execute DataImportHandler (to index data from a
>>>>>> db), at the start or at the end I need to delete some expired
>>>>>> documents. I have to delete them from the database and from the
>>>>>> index. I know which documents must be deleted because a field in
>>>>>> the db says so. I would not like to delete first all from DB or
>>>>>> first all from index, but one from index and one from doc every time.
>>>>>
>>>>> You can override the init() and destroy() of the SqlEntityProcessor and
>>>>> use it as the processor for the root entity. At this point you can
>>>>> run the necessary db queries and Solr delete queries. Look at
>>>>> Context#getSolrCore() and Context#getDataSource(String)
>>>>>
>>>>>
>>>>>> The "delete mark" is set by an update to the db row, so I think I
>>>>>> could use delta-import. I don't know if deletedPkQuery is the way to do
>>>>>> that. I cannot find much information about how to make it work. As
>>>>>> deltaQuery modifies docs (delete old and insert new) I suppose there
>>>>>> must be an easy way to do just the delete and not the new
>>>>>> insert.
>>>>> deletedPkQuery does everything first. It runs the query and uses that
>>>>> to identify the deleted rows.
>>>>>>
>>>>>> 2.- This is probably my most difficult goal.
>>>>>> Delta-import reads a timestamp from dataimport.properties and
>>>>>> modifies/adds all documents from the db which were inserted after that date.
>>>>>> What I want is to be able to save in the field the id of the last
>>>>>> indexed doc, so that the next time I execute the indexer it starts
>>>>>> indexing from that last indexed doc id.
>>>>> You can use a Transformer to write something to the DB.
>>>>> Context#getDataSource(String) for each row
>>>>>
>>>>>> The point of doing this is that if I do a full import from a db with
>>>>>> lots of rows the app could encounter a problem in the middle of the
>>>>>> execution and abort the process. The way delta query works, I would have to
>>>>>> restart the execution from the beginning. With this new
>>>>>> functionality I could optimize the index and start from the last
>>>>>> indexed doc.
>>>>>> I think I should begin modifying the SolrWriter.java and
>>>>>> DocBuilder.java.
>>>>>> Creating functions like getStartTime, persistStartTime... for ID
>>>>>> control
>>>>>>
>>>>>> 3.-I commented before about this last point. I want to give boost to
>>>>>> doc fields at indexing time.
>>>>>>>>Adding fieldboost is a planned item.
>>>>>>
>>>>>>>>It must work as follows .
>>>>>>>>Add a special value $fieldBoost.<fieldname> to the row map
>>>>>>
>>>>>>>>And DocBuilder should respect that. You can raise a bug and we can
>>>>>>>>commit it soon.
>>>>>> How can I raise a bug?
>>>>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>>>>
>>>>>> Thanks in advance
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-
>>>>>> db--lastIndexed-id-feature-tp20788755p20788755.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> --Noble Paul
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db
>>>> --lastIndexed-id-feature-tp20788755p20790542.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> --Noble Paul
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20801932.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul
