OK, I guess I see it. I am thinking of exposing the writes to the properties file via an API,
say Context#persist(key, value). This can write the data to dataimport.properties. You must be
able to retrieve that value by ${dataimport.persist.<key>} or through an API,
Context.getPersistValue(key). You can raise an issue and give a patch and we can get it
committed. I guess this is what you wish to achieve.
--Noble

On Wed, Dec 3, 2008 at 3:28 AM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>
> Do you mean the file used by DataImportHandler called dataimport.properties?
> If you mean this one, it's written at the end of the indexing process. The
> written date will be used by the delta query in the next run to identify
> the new or modified rows in the database.
>
> What I am trying to do is save the last indexed id instead of a timestamp.
> Doing that, in the next execution I will start indexing from the last doc
> that was indexed in the previous run. But I am still a bit confused about
> how to do that...
>
> Noble Paul നോബിള് नोब्ळ् wrote:
>>
>> delta-import file?
>>
>> On Wed, Dec 3, 2008 at 12:08 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
>>> Does the DIH delta feature rewrite the delta-import file for each set of
>>> rows? If it does not, that sounds like a bug/enhancement.
>>> Lance
>>>
>>> -----Original Message-----
>>> From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]]
>>> Sent: Tuesday, December 02, 2008 8:51 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: DataImportHandler: Deleteing from index and db; lastIndexed
>>> id feature
>>>
>>> You can write the details to a file using a Transformer itself.
>>>
>>> It is wise to stick to the public API as far as possible. We will
>>> maintain back compat and your code will be usable with newer versions.
>>>
>>> On Tue, Dec 2, 2008 at 5:12 PM, Marc Sturlese <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>> Thanks, I really appreciate your help.
>>>>
>>>> I didn't explain myself so well here:
>>>>
>>>>> 2.- This is probably my most difficult goal.
>>>>> Delta-import reads a timestamp from dataimport.properties and
>>>>> modifies/adds all documents from the db which were inserted after that
>>>>> date. What I want is to be able to save the id of the last indexed
>>>>> doc, so the next time I execute the indexer it starts indexing from
>>>>> that last indexed doc id.
>>>> You can use a Transformer to write something to the DB.
>>>> Context#getDataSource(String) for each row
>>>>
>>>> When I said:
>>>>
>>>>> be able to save in the field the id of the last indexed doc
>>>> I made a mistake; I meant:
>>>>
>>>> be able to save in the file (dataimport.properties) the id of the last
>>>> indexed doc.
>>>> The point would be to do my own delta-query indexing from the last
>>>> indexed doc id instead of the timestamp.
>>>> So I think this would not work in that case (my mistake because of the
>>>> bad explanation):
>>>>
>>>>> You can use a Transformer to write something to the DB.
>>>>> Context#getDataSource(String) for each row
>>>>
>>>> That is why I was saying:
>>>>> I think I should begin by modifying SolrWriter.java and
>>>>> DocBuilder.java, creating functions like getStartTime,
>>>>> persistStartTime... for ID control.
>>>>
>>>> Am I going in the correct direction?
>>>> Sorry for my English, and thanks in advance.
>>>>
>>>> Noble Paul നോബിള് नोब्ळ् wrote:
>>>>>
>>>>> On Tue, Dec 2, 2008 at 3:01 PM, Marc Sturlese
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Hey there,
>>>>>>
>>>>>> I have my DataImportHandler almost completely configured. I am
>>>>>> missing three goals. I don't think I can reach them just via xml
>>>>>> conf or a Transformer and SqlEntityProcessor plugin, but I need to
>>>>>> be sure of that.
>>>>>> If there's no other way I will hack some Solr source classes, and
>>>>>> would like to know the best way to do that. Once I have it solved,
>>>>>> I can upload or post the source in the forum in case someone thinks
>>>>>> it can be helpful.
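The persistence being discussed (writing a last-indexed id into dataimport.properties and reading it back on the next run) boils down to standard `java.util.Properties` handling, which is what the file uses. A minimal standalone sketch of the idea — note that `Context#persist`/`Context.getPersistValue` are only proposed above, not an existing API, and the key name `last_indexed_id` is made up for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class LastIndexedIdStore {

    // Persist a key into the properties file, roughly what the proposed
    // Context#persist(key, value) would do under the hood.
    static void persist(Path file, String key, String value) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        props.setProperty(key, value);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "DataImportHandler persisted values");
        }
    }

    // Read it back on the next run, roughly what the proposed
    // Context.getPersistValue(key) would do.
    static String retrieve(Path file, String key, String fallback) throws IOException {
        if (!Files.exists(file)) {
            return fallback;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props.getProperty(key, fallback);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("dataimport", ".properties");
        persist(file, "last_indexed_id", "10482");
        System.out.println(retrieve(file, "last_indexed_id", "0"));
    }
}
```

A delta run built on this would read `last_indexed_id` before querying and write the new maximum id after a successful commit.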
>>>>>>
>>>>>> 1.- Every time I execute DataImportHandler (to index data from a
>>>>>> db), at start time or end time I need to delete some expired
>>>>>> documents. I have to delete them from the database and from the
>>>>>> index. I know which documents must be deleted because a field in
>>>>>> the db says so. I would rather not delete everything from the DB
>>>>>> first and then everything from the index, but one from the index
>>>>>> and one from the db each time.
>>>>>
>>>>> You can override the init() and destroy() of SqlEntityProcessor and
>>>>> use it as the processor for the root entity. At that point you can
>>>>> run the necessary db queries and Solr delete queries. Look at
>>>>> Context#getSolrCore() and Context#getDataSource(String).
>>>>>
>>>>>> The "delete mark" is set as an update on the db row, so I think I
>>>>>> could use delta-import. I don't know if deletedPkQuery is the way
>>>>>> to do that; I cannot find much information about how to make it
>>>>>> work. As deltaQuery modifies docs (deletes old and inserts new) I
>>>>>> suppose there must be an easy way to do this by just doing the
>>>>>> delete and not the new insert.
>>>>> deletedPkQuery is run first. It uses that query to identify the
>>>>> deleted rows.
>>>>>>
>>>>>> 2.- This is probably my most difficult goal.
>>>>>> Delta-import reads a timestamp from dataimport.properties and
>>>>>> modifies/adds all documents from the db which were inserted after
>>>>>> that date. What I want is to be able to save the id of the last
>>>>>> indexed doc, so the next time I execute the indexer it starts
>>>>>> indexing from that last indexed doc id.
>>>>> You can use a Transformer to write something to the DB.
>>>>> Context#getDataSource(String) for each row
>>>>>
>>>>>> The point of doing this is that if I do a full import from a db
>>>>>> with lots of rows the app could encounter a problem in the middle
>>>>>> of the execution and abort the process.
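For reference, `deletedPkQuery` is configured alongside the other queries on the entity in data-config.xml. A sketch with made-up table and column names (`item`, `deleted`, `last_modified`), assuming the `${dataimporter...}` delta variable syntax of recent DIH versions:

```xml
<entity name="item" pk="id"
        query="SELECT id, title FROM item WHERE deleted = 0"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified > '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, title FROM item
                          WHERE id = '${dataimporter.delta.id}'"
        deletedPkQuery="SELECT id FROM item
                        WHERE deleted = 1
                          AND last_modified > '${dataimporter.last_index_time}'">
</entity>
```

On a delta-import, the ids returned by `deletedPkQuery` are removed from the index, while `deltaQuery`/`deltaImportQuery` handle the modified rows — so the delete happens without a corresponding re-insert.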
>>>>>> As deltaQuery works now, I would have to restart the execution
>>>>>> from the beginning. Having this new functionality I could optimize
>>>>>> the index and start from the last indexed doc.
>>>>>> I think I should begin by modifying SolrWriter.java and
>>>>>> DocBuilder.java, creating functions like getStartTime,
>>>>>> persistStartTime... for ID control.
>>>>>>
>>>>>> 3.- I commented before about this last point. I want to give
>>>>>> boosts to doc fields at indexing time.
>>>>>>>> Adding field boost is a planned item.
>>>>>>>>
>>>>>>>> It must work as follows: add a special value
>>>>>>>> $fieldBoost.<fieldname> to the row map, and DocBuilder should
>>>>>>>> respect that. You can raise a bug and we can commit it soon.
>>>>>> How can I raise a bug?
>>>>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>>>>
>>>>>> Thanks in advance
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20788755.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> --
>>>>> --Noble Paul
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20790542.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>> --
>>> --Noble Paul
>>
>>
>> --
>> --Noble Paul
>
> --
> View this message in context:
> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20801932.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
--Noble Paul
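The `$fieldBoost.<fieldname>` convention mentioned in the thread is described as planned, not shipped, so the following is only a sketch of what a Transformer's `transformRow` might look like once DocBuilder respects that key. It is written against a plain `java.util.Map` so it runs standalone; the `title` and `featured` columns are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class BoostTransformer {

    // Mimics a DIH Transformer's transformRow(Map<String, Object> row):
    // copies the row and attaches the planned $fieldBoost.<fieldname> key,
    // which DocBuilder would then apply as an index-time field boost.
    public static Map<String, Object> transformRow(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        if (row.get("title") != null) {
            // Boost the title of rows flagged as featured (hypothetical column).
            boolean featured = Boolean.TRUE.equals(row.get("featured"));
            out.put("$fieldBoost.title", featured ? 2.0f : 1.0f);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("title", "example doc");
        row.put("featured", Boolean.TRUE);
        System.out.println(transformRow(row).get("$fieldBoost.title"));
    }
}
```

In a real DIH setup this logic would live in a class implementing the Transformer contract and be listed in the entity's `transformer` attribute.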