Re: [xml] Xml Question

2019-07-05 Thread Eric Eberhard
Oh -- for a smaller file, here is some cheap code that works fine.  You will
have to create a new document for each of the smaller pieces and then copy
the pieces over like so:

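/* Deep-copy each node in the source list and append it to the target. */
for (cur = fromwrk->cur; cur != NULL; cur = cur->next) {
    tmp = xmlCopyNode(cur, 1);    /* 1 = recursive copy of the subtree */
    if (tmp != NULL)
        xmlAddChild(towrk->cur, tmp);
}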

fromwrk being your original file and towrk being your current little file.

E

-----Original Message-----
From: xml [mailto:xml-boun...@gnome.org] On Behalf Of Eric Eberhard
Sent: Friday, July 05, 2019 12:19 PM
To: 'Liam R E Quin'; 'Ashjan Alsulaimani'; xml@gnome.org
Subject: Re: [xml] Xml Question

Dear Ashjan,

If it was me I'd do it the cheap way and not use the parser.  Get the file
and then read through it with your favorite language and look for starting
tags you want moved, then scan until you hit the ending tag, write that out.
Rinse and repeat.  You can use the parser on each piece you write out.

It is surely possible to do it in both ways described and I know of others
that work on small files.  But this is a LOT easier.

Eric

-----Original Message-----
From: xml [mailto:xml-boun...@gnome.org] On Behalf Of Liam R E Quin
Sent: Thursday, July 04, 2019 6:28 AM
To: Ashjan Alsulaimani; xml@gnome.org
Subject: Re: [xml] Xml Question

On Thu, 2019-07-04 at 10:33 +0100, Ashjan Alsulaimani wrote:
> 
> 
> What's the best way to approach such a task and the most efficient way 
> as I'm dealing with Medline database!

If your input files are a few hundred megabytes or less, start with the XSLT
identity transform and add empty templates to match what you want to delete.

If your input is over a gigabyte (say) or you do lots of different subsets
of the same document, you may find XQuery update works better for you, with
a database (e.g. BaseX or eXist-db).

Liam


--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Upcoming courses: DocBook (sold out); CSS for XML People

___
xml mailing list, project page  http://xmlsoft.org/ xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml




Re: [xml] Xml Question

2019-07-05 Thread Eric Eberhard
Your answer is spot on.  I don't know if he has markup and CDATA or if his 
files are large.  If none of those are true, cheap is good :-)  If it is a gig 
file with CDATA and markup, cheap would be bad.

E

-----Original Message-----
From: Liam R E Quin [mailto:l...@holoweb.net] 
Sent: Friday, July 05, 2019 2:24 PM
To: Eric Eberhard; 'Ashjan Alsulaimani'; xml@gnome.org
Subject: Re: [xml] Xml Question

On Fri, 2019-07-05 at 12:18 -0700, Eric Eberhard wrote:
> Dear Ashjan,
> 
> If it was me I'd do it the cheap way and not use the parser. 

Make sure to handle markup in comments and CDATA sections properly, and to
process external files included with XInclude or by entities defined in the DTD.

Working with XML at the text level can be reasonably safe if you know the input
files well, and yes, I sometimes do it too, but cheap isn't the same as good :)

Liam


--
Liam Quin, https://www.delightfulcomputing.com/

Upcoming course:   CSS for XML People, Rockville MD, August 2019
   See https://www.delightfulcomputing.com/





Re: [xml] Xml Question

2019-07-05 Thread Liam R E Quin
On Fri, 2019-07-05 at 12:18 -0700, Eric Eberhard wrote:
> Dear Ashjan,
> 
> If it was me I'd do it the cheap way and not use the parser. 

Make sure to handle markup in comments and CDATA sections properly, and
to process external files included with XInclude or by entities defined
in the DTD.

Working with XML at the text level can be reasonably safe if you know
the input files well, and yes, I sometimes do it too, but cheap isn't
the same as good :)
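
As a rough illustration of the comment and CDATA point, here is a minimal C sketch of a text-level search that ignores anything inside those sections. The function names skip_opaque and find_markup are made up for this example, and it assumes the whole input sits in one NUL-terminated buffer. It does not look at processing instructions or the DTD, and it cannot resolve XInclude or entity references; those still need a real parser.

```c
#include <string.h>

/*
 * If p points at the start of a comment or CDATA section, return a pointer
 * just past its terminator; otherwise return p unchanged. An unterminated
 * section consumes the rest of the buffer.
 */
const char *skip_opaque(const char *p)
{
    const char *term = NULL;
    size_t open_len = 0;
    const char *end;

    if (strncmp(p, "<!--", 4) == 0) {
        term = "-->";
        open_len = 4;
    } else if (strncmp(p, "<![CDATA[", 9) == 0) {
        term = "]]>";
        open_len = 9;
    }
    if (term == NULL)
        return p;

    /* Search after the opener so it cannot overlap the terminator. */
    end = strstr(p + open_len, term);
    return end != NULL ? end + 3 : p + strlen(p);
}

/*
 * Return the first occurrence of needle in buf that lies outside comments
 * and CDATA sections, or NULL if there is none.
 */
const char *find_markup(const char *buf, const char *needle)
{
    size_t nlen = strlen(needle);
    const char *p = buf;

    while (*p != '\0') {
        const char *q = skip_opaque(p);
        if (q != p) {              /* jumped over a comment or CDATA section */
            p = q;
            continue;
        }
        if (strncmp(p, needle, nlen) == 0)
            return p;
        p++;
    }
    return NULL;
}
```

Calling find_markup(buf, "<Item") on a buffer where "<Item" first appears inside a comment returns the later, real start tag instead (Item being a placeholder element name).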

Liam


-- 
Liam Quin, https://www.delightfulcomputing.com/

Upcoming course:   CSS for XML People, Rockville MD, August 2019
   See https://www.delightfulcomputing.com/



Re: [xml] Xml Question

2019-07-05 Thread Eric Eberhard
Dear Ashjan,

If it was me I'd do it the cheap way and not use the parser.  Get the file
and then read through it with your favorite language and look for starting
tags you want moved, then scan until you hit the ending tag, write that out.
Rinse and repeat.  You can use the parser on each piece you write out.
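
A minimal sketch of that scan in plain C, under some loud assumptions: the whole file has been read into one NUL-terminated buffer, elements of the wanted name are never nested inside each other, the self-closing form is not used, and nothing that looks like the tag appears inside comments or CDATA. The function name next_element and the element name Item used below are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Find the next <name ...>...</name> element in buf, starting at *pos.
 * On success, store the element's start offset and length, advance *pos
 * past the element, and return 1. Return 0 when no further element exists.
 */
int next_element(const char *buf, const char *name,
                 size_t *pos, size_t *start, size_t *len)
{
    char open_tag[64], close_tag[64];
    size_t nlen = strlen(name);
    const char *p = buf + *pos;
    const char *end;

    if (nlen + 4 > sizeof close_tag)   /* "</" + name + ">" + NUL must fit */
        return 0;
    snprintf(open_tag, sizeof open_tag, "<%s", name);
    snprintf(close_tag, sizeof close_tag, "</%s>", name);

    /* Accept "<name" only when followed by '>' or whitespace, so that
     * "<Item" does not match "<ItemList". */
    while ((p = strstr(p, open_tag)) != NULL) {
        char c = p[nlen + 1];
        if (c == '>' || isspace((unsigned char)c))
            break;
        p += nlen + 1;
    }
    if (p == NULL)
        return 0;

    end = strstr(p, close_tag);
    if (end == NULL)
        return 0;
    end += strlen(close_tag);

    *start = (size_t)(p - buf);
    *len = (size_t)(end - p);
    *pos = (size_t)(end - buf);
    return 1;
}
```

Start with pos at 0 and call next_element(buf, "Item", &pos, &start, &len) in a loop; each time it returns 1, write len bytes from buf + start to the output file with fwrite, then hand that piece to the parser.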

It is surely possible to do it in both ways described and I know of others
that work on small files.  But this is a LOT easier.

Eric

-----Original Message-----
From: xml [mailto:xml-boun...@gnome.org] On Behalf Of Liam R E Quin
Sent: Thursday, July 04, 2019 6:28 AM
To: Ashjan Alsulaimani; xml@gnome.org
Subject: Re: [xml] Xml Question

On Thu, 2019-07-04 at 10:33 +0100, Ashjan Alsulaimani wrote:
> 
> 
> What's the best way to approach such a task and the most efficient way 
> as I'm dealing with Medline database!

If your input files are a few hundred megabytes or less, start with the XSLT
identity transform and add empty templates to match what you want to delete.

If your input is over a gigabyte (say) or you do lots of different subsets
of the same document, you may find XQuery update works better for you, with
a database (e.g. BaseX or eXist-db).

Liam


--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Upcoming courses: DocBook (sold out); CSS for XML People



Re: [xml] Xml Question

2019-07-04 Thread Liam R E Quin
On Thu, 2019-07-04 at 10:33 +0100, Ashjan Alsulaimani wrote:
> 
> 
> What's the best way to approach such a task and the most efficient
> way as I'm dealing with Medline database!

If your input files are a few hundred megabytes or less, start with the
XSLT identity transform and add empty templates to match what you want
to delete.

If your input is over a gigabyte (say) or you do lots of different
subsets of the same document, you may find XQuery update works better
for you, with a database (e.g. BaseX or eXist-db).

Liam


-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Upcoming courses: DocBook (sold out); CSS for XML People



Re: [xml] Xml Question

2019-07-04 Thread Shlomi Fish
Hi,

On Thu, 4 Jul 2019 10:33:39 +0100
Ashjan Alsulaimani  wrote:

> Dear XMLers
> 
> I have an urgent question and I am not sure if you want to answer me or not
> but let's try...
> 
> My question is: I would like to create a subset of XML files from the
> original XML files while keeping the same structure. You can say I would
> like to filter them to get subsets.
> 
> What's the best way to approach such a task and the most efficient way as
> I'm dealing with Medline database!
> 

Try https://en.wikipedia.org/wiki/XSLT or https://en.wikipedia.org/wiki/XPath
and DOM.

> Many Thanks
> Ashjan



-- 
-
Shlomi Fish   http://www.shlomifish.org/
The Case for File Swapping - http://shlom.in/file-swap

The Knights Who Say “Ni” once said “Ni” to Chuck Norris. They are now no longer
The Knights Who Say “Ni”.
— http://www.shlomifish.org/humour/bits/facts/Chuck-Norris/

Please reply to list if it's a mailing list post - http://shlom.in/reply .