> Anyway, I am happy now :-)
> Thanks everybody!!

great!

>
> Have a good weekend,

Done...

Regards Ard

>
> Reinier
>
>
> Ard Schrijvers wrote:
>>
>> On Fri, Jun 12, 2009 at 3:16 PM, Reinier van den
>> Born<[email protected]> wrote:
>>>
>>> Hi Ard,
>>>
>>> To make sure I understand this correctly, I should do:
>>>
>>>  delete; wait; write.
>>>
>>> Currently the cycle time is 5 seconds, which would make it a very slow
>>> process.
>>> Alternatively I could delete all first and then write,
>>
>> that is what i meant
>>
>>> but that would mean that all content would be gone for 5 seconds.
>>
>> Yes, true
>>
>>> Makes it worthwhile to find a way to write only documents that have been
>>> modified.
>>
>> As I said, the problem does not occur when you just change an existing
>> document: why delete it and add it again? Just overwrite its contents;
>> it is faster and does not have the issue.
>
> That's what I did in the beginning. It does have the issue.
>
>>> So how does a tool like Dav2Disk handle this?
>>
>> I don't know the tool. AFAIK, a tool like that is meant for initial
>> importing, not primarily for a repository that is in production.
>>
>>> It is certainly not waiting 5 seconds for each file to write.
>>> Nor is it deleting everything before it writes, is it?
>>> Or does it suffer from the same problem and I just never noticed it?
>>
>> As explained, I don't know the tool. But here is my suggestion:
>>
>> You shouldn't delete and add documents that haven't been changed: it
>> doesn't make sense.
>>
>> How to avoid it:
>>
>> 1) compute a simple md5 or some other hash of the document's text before
>> putting it in the repository
>> 2) store the md5 as a property
>> 3) before deleting / adding a document, compute its md5 and check whether it
>> exists in the repository (simple search)
>> 4) modify changed documents instead of doing a delete/add cycle
>>
>> I am confident this does solve your issue. You can test it first if
>> you want with the 5 sec delay to be sure
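
A minimal sketch of steps 1) to 4) above, in plain Java. The names ChangeDetector, needsWrite and recordWrite are hypothetical, and the in-memory map only stands in for the hash property stored in the repository; a real client would read and write that property through whatever repository API it uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: decides whether a document needs to be written again. */
final class ChangeDetector {

    // Assumption: stand-in for the md5 property kept in the repository (step 2),
    // keyed by document path.
    private final Map<String, String> storedHashes = new HashMap<>();

    /** Step 1: hex-encoded MD5 of the document text. */
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Step 3: true when the path is unknown or its text differs from the stored hash. */
    boolean needsWrite(String path, String text) {
        return !md5Hex(text).equals(storedHashes.get(path));
    }

    /** Steps 2 and 4: call after overwriting the document in place. */
    void recordWrite(String path, String text) {
        storedHashes.put(path, md5Hex(text));
    }
}
```

With this in place, the update loop would call needsWrite for each incoming document, overwrite only those that return true, and skip the rest, so unchanged documents are never deleted and re-added.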
>>
>> Regards Ard
>>
>>> Reinier
>>>
>>> Ard Schrijvers wrote:
>>>>
>>>> Hello Reinier,
>>>>
>>>> On Fri, Jun 12, 2009 at 1:59 PM, Reinier van den
>>>> Born<[email protected]> wrote:
>>>>>
>>>>> Bart,
>>>>>
>>>>> The version of the repository is 1.2.15.1.
>>>>>
>>>>> Btw. I tried deleting before writing. It doesn't make a difference.
>>>>
>>>> This is a known issue, not easy to solve. You have two possible
>>>> solutions:
>>>>
>>>> 1) instead of a deletion / add cycle, you modify an existing document
>>>> 2) do the deletion of the old ones in a separate cycle, with at least
>>>> a delay of X seconds, where X is the value in your cron configuration
>>>> of the indexer.xml
>>>>
>>>> I hope this isn't too much of a problem for you. At least you can
>>>> check whether my proposed solution works.
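
A rough sketch of option 2), assuming the order "delete everything, wait, then write". DocumentStore and TwoPhaseSync are hypothetical names, not part of the Hippo Java Adapter, and the delay is assumed to be the indexer's cron interval from indexer.xml.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical repository operations; a real client would map these to its own API. */
interface DocumentStore {
    void delete(String path);
    void write(String path, String content);
}

final class TwoPhaseSync {

    /**
     * Deletes all stale documents in one cycle, waits for at least one indexer
     * cycle, and only then writes the new documents.
     */
    static void sync(DocumentStore store,
                     List<String> stalePaths,
                     Map<String, String> newDocs,
                     long indexerCycleMillis) {
        // Phase 1: remove the old documents.
        for (String path : stalePaths) {
            store.delete(path);
        }
        // Give the indexer time to process the deletions.
        try {
            Thread.sleep(indexerCycleMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the indexer", e);
        }
        // Phase 2: write the new documents.
        for (Map.Entry<String, String> doc : newDocs.entrySet()) {
            store.write(doc.getKey(), doc.getValue());
        }
    }
}
```

The trade-off is that the content is missing from the repository for the whole waiting period, which is why modifying documents in place is usually the better choice.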
>>>>
>>>> Regards Ard
>>>>
>>>>> Reinier
>>>>>
>>>>>
>>>>> Bart van der Schans wrote:
>>>>>>
>>>>>> Reinier,
>>>>>>
>>>>>> Which version of the repository are you using?
>>>>>>
>>>>>> Bart
>>>>>>
>>>>>> On Fri, Jun 12, 2009 at 1:14 PM, Reinier van den Born
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Jasha,
>>>>>>>
>>>>>>> Rebuilding the index fixed the problem of results not showing up.
>>>>>>> Problem remains that if content is written twice it shows up twice.
>>>>>>>
>>>>>>> Maybe I should delete the existing document before I write it?
>>>>>>> (at the moment I simply overwrite...)
>>>>>>>
>>>>>>> Reinier
>>>>>>>
>>>>>>>
>>>>>>> Jasha Joachimsthal wrote:
>>>>>>>>
>>>>>>>> Hi Reinier,
>>>>>>>>
>>>>>>>> this looks like your Lucene index contains some errors if some results
>>>>>>>> appear twice and others don't appear at all. Try rebuilding the index.
>>>>>>>>
>>>>>>>> Jasha Joachimsthal
>>>>>>>>
>>>>>>>> [email protected] - [email protected]
>>>>>>>>
>>>>>>>> www.onehippo.com
>>>>>>>> Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam
>>>>>>>> +31(0)20-5224466
>>>>>>>> San Francisco - Hippo USA Inc. 185 H Street, suite B, Petaluma CA
>>>>>>>> 94952 +1 (707) 7734646
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2009/6/11 Reinier van den Born <[email protected]>:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am trying to automatically update a collection of documents in a Hippo
>>>>>>>>> repository.
>>>>>>>>> Each document is kept in its own collection within a "main" collection:
>>>>>>>>> ../1/a.xml, ../2/b.xml, etc.
>>>>>>>>> Each update is independent of earlier ones: I don't need caching, JMS,
>>>>>>>>> or anything else.
>>>>>>>>>
>>>>>>>>> So I do a simple scan for old documents (fetchCollection), upload the new
>>>>>>>>> ones and delete the old ones.
>>>>>>>>> Very simple, so I was thinking I could use the Java Adapter directly...
>>>>>>>>>
>>>>>>>>> That works except for the scan. Its function is similar to
>>>>>>>>> "ls .../*/*.xml".
>>>>>>>>> But my code+DASL gives me a weird response:
>>>>>>>>> - only documents show up that have recently been touched by the CMS
>>>>>>>>> (clicked on, not necessarily opened)
>>>>>>>>> - the documents I write appear repeated in the list (=duplicates; each
>>>>>>>>> write cycle adds one occurrence)
>>>>>>>>> - this duplication is reset when I change the DASL query (e.g. depth to 1,
>>>>>>>>> which returns no documents, and back to 2).
>>>>>>>>> - all documents are listed correctly by the CMS and DAVexplorer, no
>>>>>>>>> problemo.
>>>>>>>>>
>>>>>>>>> I use my own plain WebdavServiceImpl, which I assume does no caching.
>>>>>>>>> Also, nothing changes when I restart my app (Tomcat), nor when I restart
>>>>>>>>> the repo.
>>>>>>>>>
>>>>>>>>> Anyway, any help is appreciated. See the code below.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Reinier
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------
>>>>>>>>> Here is the code I use:
>>>>>>>>>
>>>>>>>>> .....
>>>>>>>>> public void hippoInit (Properties props) {
>>>>>>>>>  try {
>>>>>>>>>    WebdavConfig webdavConfig = new WebdavConfig(props);
>>>>>>>>>    webdavService = new WebdavServiceImpl(webdavConfig);
>>>>>>>>>    rootPath      = webdavService.getBasePath();
>>>>>>>>>  }
>>>>>>>>>  catch (Exception e) {
>>>>>>>>>    error( "Error initializing Hippo repository connection: "+e.getMessage());
>>>>>>>>>  }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> public HashMap hippoScanJobOpenings (String relPath) {
>>>>>>>>>  HashMap jobs = new HashMap();
>>>>>>>>>  jobs.put( "REPO.RELPATH", relPath );
>>>>>>>>>
>>>>>>>>>  String query = Interpolation.interpolate( jobsQuery, jobs );
>>>>>>>>>  try {
>>>>>>>>>    DocumentCollection coll = webdavService.fetchCollection( rootPath, query, false );
>>>>>>>>>    List docs = coll.getDocuments();
>>>>>>>>>
>>>>>>>>>    Iterator iter = docs.iterator();
>>>>>>>>>    while (iter.hasNext()) {
>>>>>>>>>        Document collDoc = (Document) iter.next();
>>>>>>>>>        String   dirPath = ((DocumentPath) collDoc.getPath()).getRelativePath();
>>>>>>>>>        message( "Found job: "+dirPath );
>>>>>>>>>    }
>>>>>>>>>  }
>>>>>>>>>  catch (Exception e) {
>>>>>>>>>    error( "Error getting existing job openings: "+e.getMessage());
>>>>>>>>>  }
>>>>>>>>>  return jobs;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> The DASL query used is:
>>>>>>>>>
>>>>>>>>> <d:searchrequest xmlns:d="DAV:"
>>>>>>>>> xmlns:S="http://jakarta.apache.org/slide/"
>>>>>>>>> xmlns:h="http://hippo.nl/cms/1.0">
>>>>>>>>>  <d:basicsearch>
>>>>>>>>>  <d:select>
>>>>>>>>>  <d:prop>
>>>>>>>>>    <h:caption/>
>>>>>>>>>    <d:displayname/>
>>>>>>>>>    <h:type/>
>>>>>>>>>    <d:modificationdate/>
>>>>>>>>>  </d:prop>
>>>>>>>>>  </d:select>
>>>>>>>>>  <d:from>
>>>>>>>>>  <d:scope>
>>>>>>>>>    <d:href>${REPO.RELPATH}</d:href>
>>>>>>>>>    <d:depth>2</d:depth>
>>>>>>>>>  </d:scope>
>>>>>>>>>  </d:from>
>>>>>>>>>  <d:where>
>>>>>>>>>  <d:eq>
>>>>>>>>>    <d:prop><h:type/></d:prop>
>>>>>>>>>    <d:literal>jobopening</d:literal>
>>>>>>>>>  </d:eq>
>>>>>>>>>  </d:where>
>>>>>>>>>  <d:orderby>
>>>>>>>>>  <d:order>
>>>>>>>>>    <d:prop><h:modificationDate/></d:prop>
>>>>>>>>>    <d:ascending/>
>>>>>>>>>  </d:order>
>>>>>>>>>  </d:orderby>
>>>>>>>>>  </d:basicsearch>
>>>>>>>>> </d:searchrequest>
>>>>>>>>>
>>>>>>>>> Notes:
>>>>>>>>>
>>>>>>>>> - props contains the settings to initialise the WebdavConfig object as
>>>>>>>>> described in ...
>>>>>>>>> - relPath is the path from rootPath to the collection containing the
>>>>>>>>> documents.
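
The Interpolation class used in the code above is not shown in the thread. As an illustration only, here is a guess at what such a helper might do: replace each ${KEY} placeholder in the DASL template with the matching map value. SimpleInterpolation is a hypothetical name, and the sketch does not XML-escape the inserted values.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the thread's Interpolation helper. */
final class SimpleInterpolation {

    // Matches ${KEY}, capturing KEY.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String interpolate(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown placeholders are left untouched.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For the query above, a map holding REPO.RELPATH would turn the d:href element's ${REPO.RELPATH} into the relative path of the collection to scan.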
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Reinier van den Born
>>>>>>>>> HintTech B.V.
>>>>>>>>>
>>>>>>>>> T: +31(0)88 268 25 00
>>>>>>>>> F: +31(0)88 268 25 01
>>>>>>>>> M: +31(0)6 494 171 36
>>>>>>>>>
>>>>>>>>> Delftechpark 37i | 2628 XJ Delft | The Netherlands
>>>>>>>>> www.hinttech.com
>>>>>>>>>
>>>>>>>>> HintTech is a specialist in eBusiness Technology ( .Net, Java
>>>>>>>>> platform,
>>>>>>>>> Tridion ) and IT-Projects.
>>>>>>>>> Chamber of Commerce The Hague nr. 27242282 | Sales Tax nr.
>>>>>>>>> NL8062.16.396.B01
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ********************************************
>>>>>>>>> Hippocms-dev: Hippo CMS development public mailinglist
>>>>>>>>>
>>>>>>>>> Searchable archives can be found at:
>>>>>>>>> MarkMail: http://hippocms-dev.markmail.org
>>>>>>>>> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
