[basex-talk] OAI-PMH

2013-10-17 Thread Lars Johnsen
Hello all

I am looking into the possibility of using BaseX as an OAI-PMH metadata
provider and harvester, and wondered if anyone has experience with it for
this purpose. Specifically, I mean using BaseX as a repository together with
its HTTP service and XQuery scripts for accessing and providing metadata
records.

Presumably, there aren't any limitations on the database side, and since
the OAI-PMH protocol is all XML (http://www.openarchives.org/pmh/) it seems
like a good idea to try and make it work. So if people on this list have
any experience, I would like to hear from you.

thanks,

Lars G Johnsen
National Library of Norway
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


Re: [basex-talk] Implementation of fn:collection

2013-10-17 Thread Jeremy Moseley
Hey Christian,

Thanks for the response. fn:collection is behaving as specified in the
BaseX documentation, so I guess my question changes into: Is there a method
to open a subset of documents (not distinguishable by path) in the database
with performance similar to calling db:open($db_name)?

Thanks,

Jeremy


On Thu, Oct 17, 2013 at 1:44 AM, Christian Grün
christian.gr...@gmail.com wrote:

 Hi Jeremy,

 it’s somewhat surprising, but XQuery itself has no database semantics.
 This is the reason why the implementation of fn:collection is pretty
 much implementation-defined [1]. In BaseX, the function checks if the
 specified URI matches a database. If negative, the URI will be
 resolved against the file system. I’m not sure why it didn’t turn out
 to do so in your scenario, but I’m pretty sure you’ll find some
 answers in our Wiki article on BaseX databases [2]. If not, feel free
 to give us some more feedback.

 Christian

 [1] http://www.w3.org/TR/xpath-functions/#func-collection
 [2] http://docs.basex.org/wiki/Databases
 ___

  Hi Guys,
 
  I was trying to use the fn:collection function today, but I am having
  trouble understanding the implementation. From past experience,
  fn:collection($uri) (where $uri points to a document with a list of docs
  within it) has returned a sequence of all the documents. It appears that
  BaseX does not implement it this way.
 
  I have a database where I am storing a large number of documents, and I
  would like to have several collections inside which are subsets of these
  documents. I need to open these subsets quickly, and I thought
  fn:collection would allow me to do this. Is there any way to easily open
  these subsets using fn:collection (or a similar high performance
  function)? As of right now I am solving the problem by using a for loop to
  traverse the documents, but this is not fast enough for my needs.
 
  Thanks,
 
  Jeremy
 
 



Re: [basex-talk] Implementation of fn:collection

2013-10-17 Thread Christian Grün
Hi Jeremy,

 Is there a method to
 open a subset of documents (not distinguishable by path) in the database
 with performance similar to calling db:open($db_name)?

Are there any criteria regarding the documents you want to open? If you
simply want to choose the first 10 documents, you could try a position
filter:

   collection("db")[position() = 1 to 10]

Talking about performance: fn:collection and db:open are based on the
same code, so there shouldn’t be any difference.

Hope this helps,
Christian


Re: [basex-talk] Implementation of fn:collection

2013-10-17 Thread Jeremy Moseley
Yes, sorry I should have specified the criteria. I have a list of a subset
of the documents in the database that need to be opened (I can store this
list in any form necessary), but I am experiencing performance problems
since I need to iterate over the list in order to filter or choose which
documents to open.

Thanks,

Jeremy


Re: [basex-talk] Implementation of fn:collection

2013-10-17 Thread Christian Grün
Hi Jeremy,

if the list is more or less arbitrary, then you’ll indeed have to
browse all your documents in order to find the ones that are relevant
to you. One approach could be to specify a filtering predicate:

let $paths := ("a.xml", "b.xml")
return db:open("db")[db:path(.) = $paths]

If this is too slow, string comparisons can be sped up by using a map,
as recently proposed on this list:

let $paths := ("a.xml", "b.xml")
let $map := map:new( $paths ! { . : true() })
return db:open("db")[$map(db:path(.))]

How many documents are stored in your database?

Best,
Christian
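
A possible alternative to filtering every document, sketched here under the
assumption that the stored list already contains the exact database paths:
db:open also accepts a path as a second argument, so each listed document can
be requested directly. The database name "db" and the file names are
placeholders, and whether this is faster than the predicate-based filter is
not confirmed anywhere in this thread; it would need to be measured.

```xquery
(: Placeholder database name and document paths, for illustration only. :)
let $paths := ("a.xml", "b.xml")
for $path in $paths
(: With a second argument, db:open returns the documents found at that path. :)
return db:open("db", $path)
```

In BaseX 10 and later, db:open was renamed to db:get, so the call may need to
be adapted to the installed version.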


Re: [basex-talk] Implementation of fn:collection

2013-10-17 Thread Jeremy Moseley
I am dealing with collections of documents in the 2500+ range. The code you
suggested is similar to what I have already, but much cleaner. I'm assuming
the lookup within the map is done in constant time?

Cheers,

Jeremy
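
The map-based lookup can be tried without a database. The sketch below uses
the standard XQuery 3.1 functions map:merge, map:entry and map:contains
instead of the map:new call quoted in the thread, and the path strings are
made up for illustration. The XQuery specification does not prescribe a cost
for map lookups, so treating them as roughly constant-time is an assumption
about the implementation rather than a guarantee.

```xquery
xquery version "3.1";

(: Hypothetical list of wanted document paths. :)
let $wanted := ("a.xml", "b.xml")
(: One map entry per wanted path; the value true() is only a marker. :)
let $index := map:merge($wanted ! map:entry(., true()))
(: Stand-in for the paths of all documents in a database. :)
for $candidate in ("a.xml", "c.xml", "b.xml")
where map:contains($index, $candidate)
return $candidate
(: returns "a.xml", "b.xml" :)
```

Each candidate is checked with a single map lookup, so the wanted list is not
rescanned for every document.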

