> Anyway, I am happy now :-)
> Thanks everybody!!

great!

>
> Have a good weekend,

Done...

Regards Ard

>
> Reinier
>
>
> Ard Schrijvers wrote:
>>
>> On Fri, Jun 12, 2009 at 3:16 PM, Reinier van den
>> Born<[email protected]> wrote:
>>>
>>> Hi Ard,
>>>
>>> To make sure I understand this correctly. I should do:
>>>
>>> delete; wait; write.
>>>
>>> Currently the cycle time is 5 seconds, which would make it a very slow
>>> process.
>>> Alternatively I could delete all first and then write,
>>
>> that is what I meant
>>
>>> but that would mean that all content would be gone for 5 seconds.
>>
>> Yes, true
>>
>>> Makes it worthwhile to find a way to write only documents that have been
>>> modified.
>>
>> As I said, the problem does not occur when you just change an existing
>> document: why delete it and add it again? Just overwrite its contents,
>> it is faster, and does not have the issue.
>
> That's what I did in the beginning. It does have the issue.
>
>>> So how does a tool like Dav2Disk handle this?
>>
>> I don't know the tool. AFAIK, a tool like that is meant for initial
>> importing, not primarily meant for an in-production repository
>>
>>> That is certainly not waiting 5 seconds for each file to write.
>>> Nor is it deleting everything first before it writes, or?
>>> Or does it suffer from the same problem and I just never noticed it?
>>
>> As explained, I don't know the tool. But here is my suggestion:
>>
>> You shouldn't delete and add documents that haven't been changed: it
>> doesn't make sense
>>
>> How to avoid:
>>
>> 1) compute a simple md5 or some other hash of the document's text before
>> putting it in the repository
>> 2) store the md5 as a property
>> 3) before deleting / adding a document, compute md5 and check if it
>> exists in the repository (simple search)
>> 4) modify changed documents instead of a delete/add cycle
>>
>> I am confident this does solve your issue.
>> You can test it first if
>> you want with the 5 sec delay to be sure
>>
>> Regards Ard
>>
>>> Reinier
>>>
>>> Ard Schrijvers wrote:
>>>>
>>>> Hello Reinier,
>>>>
>>>> On Fri, Jun 12, 2009 at 1:59 PM, Reinier van den
>>>> Born<[email protected]> wrote:
>>>>>
>>>>> Bart,
>>>>>
>>>>> The version of the repository is 1.2.15.1.
>>>>>
>>>>> Btw. I tried deleting before writing. It doesn't make a difference.
>>>>
>>>> This is a known issue, not easy to solve. You have two possible
>>>> solutions:
>>>>
>>>> 1) instead of a deletion / add cycle you modify an existing document
>>>> 2) do the deletion of the old ones in a separate cycle, with at least
>>>> a delay of X seconds, where X is the value in your cron configuration
>>>> of the indexer.xml
>>>>
>>>> I hope this isn't too much of a problem for you. At least, you can
>>>> check whether my proposed solution works
>>>>
>>>> Regards Ard
>>>>
>>>>> Reinier
>>>>>
>>>>>
>>>>> Bart van der Schans wrote:
>>>>>>
>>>>>> Reinier,
>>>>>>
>>>>>> Which version of the repository are you using?
>>>>>>
>>>>>> Bart
>>>>>>
>>>>>> On Fri, Jun 12, 2009 at 1:14 PM, Reinier van den Born
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Jasha,
>>>>>>>
>>>>>>> Rebuilding the index fixed the problem of results not showing up.
>>>>>>> Problem remains that if content is written twice it shows up twice.
>>>>>>>
>>>>>>> Maybe I should delete the existing document before I write it?
>>>>>>> (at the moment I simply overwrite...)
>>>>>>>
>>>>>>> Reinier
>>>>>>>
>>>>>>>
>>>>>>> Jasha Joachimsthal wrote:
>>>>>>>>
>>>>>>>> Hi Reinier,
>>>>>>>>
>>>>>>>> this looks like your Lucene index contains some errors if some results
>>>>>>>> appear twice and others don't appear at all. Try rebuilding the index.
>>>>>>>>
>>>>>>>> Jasha Joachimsthal
>>>>>>>>
>>>>>>>> [email protected] - [email protected]
>>>>>>>>
>>>>>>>> www.onehippo.com
>>>>>>>> Amsterdam - Hippo B.V.
>>>>>>>> Oosteinde 11 1017 WT Amsterdam
>>>>>>>> +31(0)20-5224466
>>>>>>>> San Francisco - Hippo USA Inc. 185 H Street, suite B, Petaluma CA
>>>>>>>> 94952 +1 (707) 7734646
>>>>>>>>
>>>>>>>> 2009/6/11 Reinier van den Born <[email protected]>:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I try to automatically update a collection of documents in a Hippo
>>>>>>>>> repository.
>>>>>>>>> Each document is kept in its own collection within a "main" collection:
>>>>>>>>> ../1/a.xml, ../2/b.xml, etc.
>>>>>>>>> Each update is independent of earlier ones: I don't need caching, JMS,
>>>>>>>>> or anything else.
>>>>>>>>>
>>>>>>>>> So I do a simple scan for old documents (fetchCollection), upload the
>>>>>>>>> new and delete the old.
>>>>>>>>> Very simple, so I was thinking I could use the Java Adapter directly...
>>>>>>>>>
>>>>>>>>> Which works except for getting the scan. Its function is similar to
>>>>>>>>> "ls .../*/*.xml".
>>>>>>>>> But my code+DASL gives me a weird response:
>>>>>>>>> - only documents show up that have recently been touched by the CMS
>>>>>>>>> (clicked on, not necessarily opened)
>>>>>>>>> - the documents I write appear repeated in the list (=duplicates, each
>>>>>>>>> write cycle one occurrence is added)
>>>>>>>>> - this duplication is reset when I change the DASL query (e.g. depth to
>>>>>>>>> 1, returns no documents, and back to 2).
>>>>>>>>> - all documents are listed correctly by CMS and DAVexplorer, no
>>>>>>>>> problemo.
>>>>>>>>>
>>>>>>>>> I use my own plain WebdavServiceImpl, which I assume does no caching.
>>>>>>>>> Also when I restart my app (tomcat) nothing changes, nor when I restart
>>>>>>>>> the repo.
>>>>>>>>>
>>>>>>>>> Anyway, any help is appreciated. See code below.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Reinier
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------
>>>>>>>>> Here the code I use:
>>>>>>>>>
>>>>>>>>> .....
>>>>>>>>> public void hippoInit (Properties props) {
>>>>>>>>>     try {
>>>>>>>>>         WebdavConfig webdavConfig = new WebdavConfig(props);
>>>>>>>>>         webdavService = new WebdavServiceImpl(webdavConfig);
>>>>>>>>>         rootPath = webdavService.getBasePath();
>>>>>>>>>     }
>>>>>>>>>     catch (Exception e) {
>>>>>>>>>         error( "Error initializing Hippo repository connection: "+e.getMessage());
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> public HashMap hippoScanJobOpenings (String relPath) {
>>>>>>>>>     HashMap jobs = new HashMap();
>>>>>>>>>     jobs.put( "REPO.RELPATH", relPath );
>>>>>>>>>
>>>>>>>>>     String query = Interpolation.interpolate( jobsQuery, jobs );
>>>>>>>>>     try {
>>>>>>>>>         DocumentCollection coll = webdavService.fetchCollection( rootPath, query, false );
>>>>>>>>>         List docs = coll.getDocuments();
>>>>>>>>>
>>>>>>>>>         Iterator iter = docs.iterator();
>>>>>>>>>         while (iter.hasNext()) {
>>>>>>>>>             Document collDoc = (Document) iter.next();
>>>>>>>>>             String dirPath = ((DocumentPath) collDoc.getPath()).getRelativePath();
>>>>>>>>>             message( "Found job: "+dirPath );
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     catch (Exception e) {
>>>>>>>>>         error( "Error getting existing job openings: "+e.getMessage());
>>>>>>>>>     }
>>>>>>>>>     return jobs;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> The DASL query used is:
>>>>>>>>>
>>>>>>>>> <d:searchrequest xmlns:d="DAV:"
>>>>>>>>>     xmlns:S="http://jakarta.apache.org/slide/"
>>>>>>>>>     xmlns:h="http://hippo.nl/cms/1.0">
>>>>>>>>>   <d:basicsearch>
>>>>>>>>>     <d:select>
>>>>>>>>>       <d:prop>
>>>>>>>>>         <h:caption/>
>>>>>>>>>         <d:displayname/>
>>>>>>>>>         <h:type/>
>>>>>>>>>         <d:modificationdate/>
>>>>>>>>>       </d:prop>
>>>>>>>>>     </d:select>
>>>>>>>>>     <d:from>
>>>>>>>>>       <d:scope>
>>>>>>>>>         <d:href>${REPO.RELPATH}</d:href>
>>>>>>>>>         <d:depth>2</d:depth>
>>>>>>>>>       </d:scope>
>>>>>>>>>     </d:from>
>>>>>>>>>     <d:where>
>>>>>>>>>       <d:eq>
>>>>>>>>>         <d:prop><h:type/></d:prop>
>>>>>>>>>         <d:literal>jobopening</d:literal>
>>>>>>>>>       </d:eq>
>>>>>>>>>     </d:where>
>>>>>>>>>     <d:orderby>
>>>>>>>>>       <d:order>
>>>>>>>>>         <d:prop><h:modificationDate/></d:prop>
>>>>>>>>>         <d:ascending/>
>>>>>>>>>       </d:order>
>>>>>>>>>     </d:orderby>
>>>>>>>>>   </d:basicsearch>
>>>>>>>>> </d:searchrequest>
>>>>>>>>>
>>>>>>>>> Notes:
>>>>>>>>>
>>>>>>>>> - props contains the settings to initialise the WebdavConfig object as
>>>>>>>>> described in ...
>>>>>>>>> - relPath is the path from rootPath to the collection containing the
>>>>>>>>> documents.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Reinier van den Born
>>>>>>>>> HintTech B.V.
>>>>>>>>>
>>>>>>>>> T: +31(0)88 268 25 00
>>>>>>>>> F: +31(0)88 268 25 01
>>>>>>>>> M: +31(0)6 494 171 36
>>>>>>>>>
>>>>>>>>> Delftechpark 37i | 2628 XJ Delft | The Netherlands
>>>>>>>>> www.hinttech.com
>>>>>>>>>
>>>>>>>>> HintTech is a specialist in eBusiness Technology ( .Net, Java platform,
>>>>>>>>> Tridion ) and IT-Projects.
>>>>>>>>> Chamber of Commerce The Hague nr. 27242282 | Sales Tax nr.
>>>>>>>>> NL8062.16.396.B01
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ********************************************
>>>>>>>>> Hippocms-dev: Hippo CMS development public mailinglist
>>>>>>>>>
>>>>>>>>> Searchable archives can be found at:
>>>>>>>>> MarkMail: http://hippocms-dev.markmail.org
>>>>>>>>> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
>>>>>>>>>
>>>>>>>> ********************************************
>>>>>>>> Hippocms-dev: Hippo CMS development public mailinglist
>>>>>>>>
>>>>>>>> Searchable archives can be found at:
>>>>>>>> MarkMail: http://hippocms-dev.markmail.org
>>>>>>>> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
>>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Reinier van den Born
>>>>>>> HintTech B.V.
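For readers of the archive: Ard's four steps (hash the text, store the hash as a property, compare before writing, modify instead of delete/add) could be sketched roughly as below. This is only an illustration under assumptions, not code from the thread. The class ChangeDetector, its Action enum and the in-memory map are hypothetical; the map merely stands in for the md5 property that would be stored with each document in the repository, and no Hippo or WebDAV API is called.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: decides per document whether a write is needed at all.
class ChangeDetector {

    enum Action { CREATE, MODIFY, SKIP }

    // Assumption: stands in for the md5 property kept with each document
    // in the repository (path -> last stored hash).
    private final Map<String, String> storedHashes = new HashMap<>();

    // Step 1: compute an md5 hash of the document's text, as lowercase hex.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Steps 2-4: compare with the stored hash and pick an action.
    // Unchanged documents are skipped; changed ones are modified in place,
    // so no delete/add cycle is ever requested.
    Action decide(String path, String text) {
        String newHash = md5Hex(text);
        String oldHash = storedHashes.get(path);
        if (oldHash == null) {
            storedHashes.put(path, newHash);
            return Action.CREATE;
        }
        if (oldHash.equals(newHash)) {
            return Action.SKIP;
        }
        storedHashes.put(path, newHash);
        return Action.MODIFY;
    }
}
```

In a real updater the caller would act on the returned value: upload for CREATE, overwrite the existing document for MODIFY, and do nothing for SKIP. Whether this removes the duplicate search results is what Ard suggests above; the sketch itself does not verify that.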
