[basex-talk] OAI-PMH
Hello all,

I am looking into the possibility of using BaseX as an OAI-PMH metadata provider and harvester, and wondered if anyone has experience with it for this purpose. Specifically, I mean using BaseX as a repository, with its HTTP service and XQuery scripts for accessing and providing metadata records.

Presumably there aren't any limitations on the database side, and since the OAI-PMH protocol is all XML (http://www.openarchives.org/pmh/), it seems like a good idea to try to make it work. So if people on this list have any experience, I would like to hear from you.

Thanks,
Lars G Johnsen
National Library of Norway
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
Re: [basex-talk] Implementation of fn:collection
Hey Christian,

Thanks for the response. fn:collection behaves as specified in the BaseX documentation, so I guess my question changes into: Is there a method to open a subset of documents (not distinguishable by path) in the database with performance similar to calling db:open($db_name)?

Thanks,
Jeremy

On Thu, Oct 17, 2013 at 1:44 AM, Christian Grün christian.gr...@gmail.com wrote:

Hi Jeremy,

it’s somewhat surprising, but XQuery itself has no database semantics. This is the reason why the behavior of fn:collection is pretty much implementation-defined [1]. In BaseX, the function checks if the specified URI matches a database. If it does not, the URI will be resolved against the file system.

I’m not sure why it didn’t turn out to do so in your scenario, but I’m pretty sure you’ll find some answers in our Wiki article on BaseX databases [2]. If not, feel free to give us some more feedback.

Christian

[1] http://www.w3.org/TR/xpath-functions/#func-collection
[2] http://docs.basex.org/wiki/Databases
___
Hi guys,

I was trying to use the fn:collection function today, but I am having trouble understanding the implementation. From past experience, fn:collection($uri) (where $uri points to a document with a list of docs within it) has returned a sequence of all the documents. It appears that BaseX does not implement it this way.

I have a database where I am storing a large number of documents, and I would like to have several collections inside it which are subsets of these documents. I need to open these subsets quickly, and I thought fn:collection would allow me to do this. Is there any way to easily open these subsets using fn:collection (or a similar high-performance function)? As of right now I am solving the problem by using a for loop to traverse the documents, but this is not fast enough for my needs.

Thanks,
Jeremy
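The lookup order Christian describes can be tried with a minimal sketch like the following. It assumes a database named "db" exists; that name is a placeholder and not taken from Jeremy's setup.

```xquery
(: Assumes a BaseX database named "db" exists; the name is a placeholder. :)
let $via-collection := collection("db")  (: URI is first matched against databases :)
let $via-db-open    := db:open("db")     (: opens the database explicitly :)
return (
  count($via-collection),
  count($via-db-open)
)
```

If both calls resolve to the same database, the two counts should match. If no database matches the name passed to fn:collection, the URI is resolved against the file system instead, as described above.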
Re: [basex-talk] Implementation of fn:collection
Hi Jeremy,

> Is there a method to open a subset of documents (not distinguishable by path) in the database with performance similar to calling db:open($db_name)?

Are there any criteria regarding the documents you want to open? If you simply want to choose the first 10 documents, you could try a position filter:

  collection("db")[position() = 1 to 10]

Talking about performance: fn:collection and db:open are based on the same code, so there shouldn’t be any difference.

Hope this helps,
Christian
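As a small illustration of the position filter, the sketch below selects the first ten documents and lists their paths so the selection can be checked. The database name "db" is a placeholder, and fn:subsequence is shown only as an equivalent standard alternative; it was not suggested in the thread.

```xquery
(: Assumes a BaseX database named "db"; the name is a placeholder. :)
let $docs := collection("db")
(: Position filter from the message above: the first ten documents, if that many exist. :)
let $first-ten := $docs[position() = 1 to 10]
(: Equivalent selection using the standard fn:subsequence function. :)
let $same-ten := subsequence($docs, 1, 10)
return (
  $first-ten ! db:path(.),
  count($same-ten)
)
```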
Re: [basex-talk] Implementation of fn:collection
Yes, sorry, I should have specified the criteria. I have a list of a subset of the documents in the database that need to be opened (I can store this list in any form necessary), but I am experiencing performance problems since I need to iterate over the list in order to filter or choose which documents to open.

Thanks,
Jeremy

On Thu, Oct 17, 2013 at 10:18 AM, Christian Grün christian.gr...@gmail.com wrote:

Hi Jeremy,

> Is there a method to open a subset of documents (not distinguishable by path) in the database with performance similar to calling db:open($db_name)?

Are there any criteria regarding the documents you want to open? If you simply want to choose the first 10 documents, you could try a position filter:

  collection("db")[position() = 1 to 10]

Talking about performance: fn:collection and db:open are based on the same code, so there shouldn’t be any difference.

Hope this helps,
Christian
Re: [basex-talk] Implementation of fn:collection
Hi Jeremy,

if the list is more or less arbitrary, then you’ll indeed have to browse all your documents in order to find the ones that are relevant to you. One approach could be to specify a filtering predicate:

  let $paths := ("a.xml", "b.xml")
  return db:open("db")[db:path(.) = $paths]

If this is too slow, string comparisons can be sped up by using a map, as recently proposed on this list:

  let $paths := ("a.xml", "b.xml")
  let $map := map:new($paths ! { . : true() })
  return db:open("db")[$map(db:path(.))]

How many documents are stored in your database?

Best,
Christian
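For readers on a processor that follows the final XQuery 3.1 map functions, the same idea can be sketched with map:merge, map:entry and map:contains in place of the map:new call and the brace syntax shown above. This is an illustrative variant, not the code from the thread; the database name and paths are the same placeholders.

```xquery
(: Hypothetical database name and document paths, as in the message above. :)
let $paths := ("a.xml", "b.xml")
(: Build a lookup map: each wanted path becomes a key mapped to true(). :)
let $wanted := map:merge($paths ! map:entry(., true()))
(: Keep only the documents whose database path is a key in the map. :)
return db:open("db")[map:contains($wanted, db:path(.))]
```

The predicate is still evaluated once per document in the database; the map only replaces the comparison against the whole list of paths with a single key lookup.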
Re: [basex-talk] Implementation of fn:collection
I am dealing with collections of documents in the 2500+ range. The code you suggested is similar to what I have already, but much cleaner. I'm assuming the lookup within the map is done in constant time?

Cheers,
Jeremy

On Thu, Oct 17, 2013 at 10:41 AM, Christian Grün christian.gr...@gmail.com wrote:

Hi Jeremy,

if the list is more or less arbitrary, then you’ll indeed have to browse all your documents in order to find the ones that are relevant to you. One approach could be to specify a filtering predicate:

  let $paths := ("a.xml", "b.xml")
  return db:open("db")[db:path(.) = $paths]

If this is too slow, string comparisons can be sped up by using a map, as recently proposed on this list:

  let $paths := ("a.xml", "b.xml")
  let $map := map:new($paths ! { . : true() })
  return db:open("db")[$map(db:path(.))]

How many documents are stored in your database?

Best,
Christian
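One further variant, which was not proposed in the thread, is to loop over the list of paths and open each document by its path with the two-argument form of db:open, instead of filtering every document in the database. This is only a sketch with placeholder names, and whether it is faster for a given database would need to be measured.

```xquery
(: Hypothetical database name and document paths. :)
let $paths := ("a.xml", "b.xml")
for $path in $paths
(: db:open with a second argument returns the documents stored under that path. :)
return db:open("db", $path)
```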