Hey,
Updating the SQL query in both index and indexAll seems to do the trick. I'm now
seeing the withdrawn items in the OAI output, and they are correctly marked
as deleted via the header attribute status="deleted".
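
For anyone following along, here is a minimal sketch of what the changed WHERE clause does. It is not the actual DSpace code: the class name WithdrawnIndexSketch, the ItemRow type and the in-memory list are hypothetical stand-ins for the "item" table, and only the two query strings are taken from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WithdrawnIndexSketch {
    // Query strings as quoted in this thread (XOAI.java, indexAll).
    static final String ORIGINAL_QUERY =
            "SELECT item_id FROM item WHERE in_archive=TRUE";
    static final String PATCHED_QUERY =
            "SELECT item_id FROM item WHERE in_archive=TRUE OR withdrawn=TRUE";

    // Hypothetical stand-in for one row of the "item" table.
    static final class ItemRow {
        final int itemId;
        final boolean inArchive;
        final boolean withdrawn;

        ItemRow(int itemId, boolean inArchive, boolean withdrawn) {
            this.itemId = itemId;
            this.inArchive = inArchive;
            this.withdrawn = withdrawn;
        }
    }

    // Mirrors the WHERE clause of ORIGINAL_QUERY: archived items only.
    static List<Integer> selectOriginal(List<ItemRow> rows) {
        List<Integer> ids = new ArrayList<>();
        for (ItemRow row : rows) {
            if (row.inArchive) {
                ids.add(row.itemId);
            }
        }
        return ids;
    }

    // Mirrors the WHERE clause of PATCHED_QUERY: archived or withdrawn items.
    static List<Integer> selectPatched(List<ItemRow> rows) {
        List<Integer> ids = new ArrayList<>();
        for (ItemRow row : rows) {
            if (row.inArchive || row.withdrawn) {
                ids.add(row.itemId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<ItemRow> rows = Arrays.asList(
                new ItemRow(1, true, false),    // archived
                new ItemRow(2, false, true),    // withdrawn: in_archive is already false
                new ItemRow(3, false, false));  // neither archived nor withdrawn
        System.out.println(selectOriginal(rows)); // prints [1]
        System.out.println(selectPatched(rows));  // prints [1, 2]
    }
}
```

With the original predicate the withdrawn item 2 never reaches the index, which would match the idDoesNotExist error; with the patched predicate it is selected and can be reported with a deleted header.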

Thanks,

-- 
Ondřej Košarko
Charles University in Prague
Faculty of Mathematics and Physics, UFAL


2014-05-03 6:45 GMT+02:00 Andrew Wong <[email protected]>:

> Hi,
>
> Quoted:
>
> “It seems the withdrawn items are not even taken into account when you
> do bin/dspace oai import -c (judging by the reported Total).”
>
> Yes, the withdrawn items are not retrieved during this process. My
> investigation follows.
>
> 1. When a record is withdrawn, some fields of the corresponding row in
> the “item” table are updated:
>
>    ‘in_archive’ is set to false
>    ‘withdrawn’ is set to true
>    ‘last_modified’ is updated to the latest timestamp
>
> 2. From the DSpace source code:
>
> /dspace-oai/src/main/java/org/dspace/xoai/app/XOAI.java
>
> Locate the following line in the method
> private int indexAll() throws DSpaceSolrIndexerException:
>
> String sqlQuery = "SELECT item_id FROM item WHERE in_archive=TRUE";
>
> This query retrieves all the item_ids that will be imported into the Solr
> OAI index. This explains why withdrawn items cannot be imported into the
> Solr index: “in_archive” has already been set to false for the item during
> the withdrawal process.
>
> So the withdrawn items should be retrieved after changing the SQL query to
>
> String sqlQuery = "SELECT item_id FROM item WHERE in_archive=TRUE OR withdrawn=TRUE";
>
> However, I have not checked comprehensively whether this change affects
> other areas.
>
> --  Andrew Wong
> Systems Librarian,
> The Hong Kong University of Science and Technology Library
>
> *From:* Ondřej Košarko [mailto:[email protected]]
> *Sent:* Saturday, March 15, 2014 12:09 AM
> *To:* [email protected]
> *Subject:* Re: [Dspace-tech] OAI and withdrawn items
>
>
>
> And one more idea regarding deletions and OAI caching.
>
>
>
> Suppose incremental harvesting is running against a repository using the
> "from" and "until" parameters. What if the harvester gets a cached result for
> an item that was deleted in the meantime? Will the harvester ever find out
> that the item was deleted?
>
>
>
> Please correct me if I'm not grasping selective harvesting correctly,
> but caching seems like a bad idea when the requests involve time ranges.
>
>
>
> Regards,
>
> OK
>
>
>
> 2014-03-14 16:51 GMT+01:00 Ondřej Košarko <[email protected]>:
>
> Hi,
>
> it seems the OAI-PMH in DSpace 4.1 doesn't display deleted items correctly
> (or at all).
>
>
>
> The repository is set to persistently keep track of deleted records by
> default. That means even the withdrawn items should appear in some of the
> listings (identifiers/records), each with a header whose status is "deleted".
>
>
>
> I'm not able to obtain the deleted records even with GetRecord[1] (all I'm
> seeing is '<error code="idDoesNotExist">The given id does not
> exist</error>').
>
>
>
> It seems the withdrawn items are not even taken into account when you
> do bin/dspace oai import -c (judging by the reported Total).
>
>
>
> Best regards,
>
> OK
>
>
>
>
>
> [1]
> http://www.openarchives.org/OAI/openarchivesprotocol.html#DeletedRecords
>
>
>
>
> ------------------------------------------------------------------------------
> "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
> Instantly run your Selenium tests across 300+ browser/OS combos.  Get
> unparalleled scalability from the best Selenium testing platform available.
> Simple to use. Nothing to install. Get started now for free."
> http://p.sf.net/sfu/SauceLabs
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
> List Etiquette:
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>
