Re: [basex-talk] Accessing DOCTYPE information after DB creation?

2014-03-28 Thread France Baril
We moved to schemas. This way I don't lose the schema declaration, and users
who edit documents from Oxygen (WebDAV connection) get all the
advantages of editing documents that are linked to their model, including
suggestions for enumerated attributes and indentation that respects spacing in
mixed content.


On Fri, Mar 28, 2014 at 7:31 AM, Imsieke, Gerrit, le-tex <
gerrit.imsi...@le-tex.de> wrote:

> You can preprocess your documents with Andrew Welch’s LexEv parser:
> http://andrewjwelch.com/lexev/
>
>
> On 28.03.2014 12:25, Christian Grün wrote:
>
>> Hi Constantine,
>>
>> unfortunately no, because this information is already consumed by the
>> XML parser (i. e., we don’t get to see it at all when the database is
>> being built).
>>
>> Suggestions from other users with similar problems are welcome.
>> Christian
>>
>>
>>> Hi all,
>>>
>>> I would really like to be able to query a large corpus of documents to
>>> get names and counts of the DTDs which are declared in the (somewhat
>>> old-fashioned now) DOCTYPE declaration:
>>>
>>> [...] version 4.5.2//EN//XML" "art452.dtd" [
>>> ]>
>>>
>>> Is there any way to get BaseX to preserve this information? Can I rewrite
>>> the doctype declaration into some sort of element node as the DB is being
>>> created so that this info can be queried?
>>>
>>> Thanks for any tips,
>>> Constantine.
>>>



-- 
France Baril
Architecte documentaire / Documentation architect
france.ba...@architextus.com
(514) 572-0341
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


Re: [basex-talk] Accessing DOCTYPE information after DB creation?

2014-03-28 Thread Hondros, Constantine (ELS-AMS)
Thanks all,

Unfortunately this is legacy content – and there is an unbelievable amount of 
it too.

So, I will probably pre-process the content and write the DTD info out into an 
element or PI node.
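
A minimal sketch of that pre-processing step, written in Python rather than against the BaseX Java API. It assumes each document is available as bytes in an ASCII-compatible encoding such as UTF-8. The PI target `doctype`, its pseudo-attributes, and the function names are hypothetical choices for illustration, not anything BaseX defines. The expat parser reports the DOCTYPE name and identifiers without loading the external DTD, and parsing stops as soon as the root element starts.

```python
import xml.parsers.expat


class _RootReached(Exception):
    """Raised to stop parsing once the prolog has been read."""


def read_doctype(xml_bytes):
    """Return (name, public_id, system_id) from the DOCTYPE, or None."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat passes the system identifier before the public identifier.
        found.append((name, public_id, system_id))

    def on_start_element(name, attrs):
        # The DOCTYPE can only occur before the root element.
        raise _RootReached()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(xml_bytes, True)
    except _RootReached:
        pass
    return found[0] if found else None


def add_doctype_pi(xml_bytes):
    """Copy the DOCTYPE details into a processing instruction in the prolog."""
    info = read_doctype(xml_bytes)
    if info is None:
        return xml_bytes
    name, public_id, system_id = info
    data = 'name="%s" public="%s" system="%s"' % (
        name, public_id or "", system_id or "")
    if "?>" in data:
        raise ValueError("identifier cannot be stored in a PI: %r" % data)
    pi = ("<?doctype %s?>" % data).encode("utf-8")
    # A PI must come after the XML declaration, if the document has one.
    if xml_bytes.startswith(b"<?xml "):
        end = xml_bytes.index(b"?>") + 2
        return xml_bytes[:end] + b"\n" + pi + xml_bytes[end:]
    return pi + b"\n" + xml_bytes
```

Once the rewritten documents are loaded, the PI should be an ordinary node of the stored document, so it can be queried like any other processing instruction.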

org.basex.core.Command.setInput(org.xml.sax.InputSource is) looks like a 
likely place to do it, judging from a _very_ quick look at the Javadoc. However, I 
am reading the XML from tar files using relatively new functionality, so if 
anyone from the BaseX team can suggest an approach to pre-processing XML 
files inside tar files, I would appreciate a few words.
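
One possible approach, sketched outside BaseX with Python's standard library, is to walk the tar archive before import and tally the declared DTDs directly. This is an assumption-laden illustration, not the BaseX tar functionality mentioned above: the function names are hypothetical, archive members are assumed to be XML files whose names end in `.xml`, and documents that are not well-formed are tallied under a placeholder key.

```python
import tarfile
import xml.parsers.expat
from collections import Counter


class _RootReached(Exception):
    """Raised to stop parsing once the prolog has been read."""


def doctype_ids(xml_bytes):
    """Return (public_id, system_id) from the DOCTYPE, or None if absent."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((public_id, system_id))

    def on_start_element(name, attrs):
        # Nothing after the root start tag is needed, so stop here.
        raise _RootReached()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(xml_bytes, True)
    except _RootReached:
        pass
    return found[0] if found else None


def count_dtds(tar_path):
    """Count DOCTYPE identifiers over all .xml members of a tar archive."""
    counts = Counter()
    with tarfile.open(tar_path) as archive:
        for member in archive:
            if not (member.isfile() and member.name.endswith(".xml")):
                continue
            data = archive.extractfile(member).read()
            try:
                counts[doctype_ids(data)] += 1
            except xml.parsers.expat.ExpatError:
                counts["<not well-formed>"] += 1
    return counts
```

Calling `count_dtds` with the path to an archive returns a `Counter` keyed by `(public_id, system_id)` pairs, with `None` as the key for documents that carry no DOCTYPE.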

BaseX rocks!

Cheers,
Constantine



Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The 
Netherlands, Registration No. 33156677, Registered in The Netherlands.


Re: [basex-talk] Accessing DOCTYPE information after DB creation?

2014-03-28 Thread Imsieke, Gerrit, le-tex
You can preprocess your documents with Andrew Welch’s LexEv parser: 
http://andrewjwelch.com/lexev/




Re: [basex-talk] Accessing DOCTYPE information after DB creation?

2014-03-28 Thread Christian Grün
Hi Constantine,

Unfortunately no, because this information is already consumed by the
XML parser (i.e., we don’t get to see it at all when the database is
being built).

Suggestions from other users with similar problems are welcome.
Christian


> Hi all,
>
> I would really like to be able to query a large corpus of documents to get
> names and counts of the DTDs which are declared in the (somewhat
> old-fashioned now) DOCTYPE declaration:
>
> [...] 4.5.2//EN//XML" "art452.dtd" [
> ]>
>
> Is there any way to get BaseX to preserve this information? Can I rewrite
> the doctype declaration into some sort of element node as the DB is being
> created so that this info can be queried?
>
> Thanks for any tips,
> Constantine.