Re: [xml] setting URL for xmlRelaxNGParserCtxt?

2005-01-26 Thread Daniel Veillard
On Wed, Jan 26, 2005 at 12:42:57PM +0100, Martijn Faassen wrote:
 Hey Daniel,
 
 I hope you'll still answer the other part of my mail (the Relax NG 
 include processing errors).. that's a bit more urgent right now. :)

  Well, that means going into a debug session (gdb etc.), and 1/ it takes
more time, 2/ I don't have a reproducible test case.

Daniel

-- 
Daniel Veillard  | Red Hat Desktop team http://redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml


Re: [xml] Using wchar_t string with libxml2.

2005-01-26 Thread Arthur_Yarwood

[EMAIL PROTECTED] wrote on 26/01/2005 10:18:34:

 On Wed, Jan 26, 2005 at 10:05:10AM +, [EMAIL PROTECTED] wrote:
  How should I be passing my wide strings into these functions? I could just
  convert them to char strings, but I could lose data. Are there any other
  issues I should be concerned about to ensure my xml is valid? e.g. what
  encoding standard should I use? UTF-16?
 
  You must convert your wide strings into UTF-8 strings before passing them
 to the libxml2 API. I think
   http://xmlsoft.org/encoding.html#internal
 is clear about that. If not I take patches to make this clearer.
 

Ok, does that mean there could possibly be a loss
of data, for characters that need two bytes? Or have I got the wrong end
of the stick when it comes to storing Unicode strings?

I've just tried using 'wcstombs()' to convert my wide
string to a multi-byte string, which I can then BAD_CAST to xmlChar. Is
this the correct way of doing this? I just tried passing a random Arabic
string through wcstombs() and it fails, because it cannot convert a wide
character.

I also tried using iconv (although I wasn't sure what my input code
should be; UTF-16?). This also failed, giving me EILSEQ, again for
a character it cannot convert.

How should I convert my random Arabic string to UTF-8?
Is it possible?

Thanks, 

Arthur.
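For reference, UTF-8 can represent every Unicode code point, so converting to it does not lose data. A minimal sketch of the iconv route follows, assuming the platform's iconv accepts "WCHAR_T" as an encoding name (glibc and GNU libiconv do; other implementations may not). The function name wide_to_utf8 is hypothetical and not part of libxml2 or iconv. Naming the input encoding this way avoids guessing whether wchar_t holds UTF-16 or UCS-4, and a mismatched input encoding is one possible cause of EILSEQ.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Convert a NUL-terminated wide string to a newly allocated,
 * NUL-terminated UTF-8 string. Returns NULL on failure; the caller
 * frees the result.
 * Assumption: iconv_open() accepts "WCHAR_T" as the source encoding.
 */
char *wide_to_utf8(const wchar_t *wide)
{
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = wcslen(wide) * sizeof(wchar_t);
    /* At most 4 UTF-8 bytes per wchar_t element, plus the terminator. */
    size_t out_size = wcslen(wide) * 4 + 1;
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)wide;   /* iconv() takes a non-const char ** */
    char *out_ptr = out;
    size_t out_left = out_size - 1;

    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *out_ptr = '\0';
    iconv_close(cd);
    return out;
}
```

The returned buffer can then be passed to libxml2 with BAD_CAST, as with any other UTF-8 char string.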


**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
[EMAIL PROTECTED]

This footnote also confirms that this email message has been checked
for all known viruses.

**
Sony Computer Entertainment Europe



Re: [xml] Using wchar_t string with libxml2.

2005-01-26 Thread Daniel Veillard
On Wed, Jan 26, 2005 at 02:05:31PM +, Arthur Yarwood wrote:
 [EMAIL PROTECTED] wrote on 26/01/2005 12:48:07:
 
 Am I just mis-using it somehow?

  Looks okay. But this is not a mailing list about character encodings
or the iconv API, really...

 I'm guessing I've got the wrong code for 
 the input. Any ideas?

  http://www.opengroup.org/onlinepubs/007908799/xsh/stddef.h.html
  the definition of wchar_t doesn't define an encoding. It seems to be
a sequence of code points. It doesn't even define the *size* of the C objects;
maybe your encoding is UCS4, but you really ought to know better what
you are using. It's likely to even be system dependent! Anyway, if you
use embedded systems, that sounds like a sure way to waste globs of memory;
I would not do that...
  The UTF-8 representation is defined in RFC 2044 (since superseded by RFC 3629).
  libxml2 has a function to take a Unicode code point and write it
in an xmlChar buffer:
  http://xmlsoft.org/html/libxml-parserInternals.html#xmlCopyCharMultiByte

 Ahh, yeah I forgot that they append that. I do apologise, I'll use my 
 own personal email account from now on.

  okay.

Daniel
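As an illustration of what writing a code point into a byte buffer involves, here is a self-contained sketch of UTF-8 encoding in plain C. It is not libxml2's code; xmlCopyCharMultiByte() does the per-code-point step inside the library. The names utf8_encode and wcs_to_utf8 are hypothetical, and wcs_to_utf8 assumes each wchar_t element holds one full Unicode code point (UCS-4, as on most Unix systems). Where wchar_t is 16 bits wide, surrogate pairs would have to be combined first.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Write the UTF-8 encoding of one code point into out[] (room for at
 * least 4 bytes). Returns the number of bytes written, or 0 if cp is
 * not a valid Unicode scalar value.
 */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* lone surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/*
 * Encode a NUL-terminated wide string into out[] as NUL-terminated UTF-8.
 * Assumption: one wchar_t == one code point.
 * Returns the byte length without the terminator, or (size_t)-1 on an
 * invalid code point or when out[] is too small.
 */
size_t wcs_to_utf8(const wchar_t *in, unsigned char *out, size_t out_size)
{
    size_t used = 0;

    for (; *in != L'\0'; in++) {
        unsigned char tmp[4];
        size_t n = utf8_encode((unsigned long)*in, tmp);
        if (n == 0 || used + n >= out_size)
            return (size_t)-1;
        memcpy(out + used, tmp, n);
        used += n;
    }
    if (used >= out_size)
        return (size_t)-1;
    out[used] = '\0';
    return used;
}
```

An Arabic letter such as U+0633 takes two bytes in this encoding, so nothing is dropped; only the byte length differs from the number of characters.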



Re: [xml] setting URL for xmlRelaxNGParserCtxt?

2005-01-26 Thread Daniel Veillard
On Wed, Jan 26, 2005 at 03:12:52PM +0100, Martijn Faassen wrote:
 So what *is* stored in these dictionaries? I still don't know. Tagnames? 
 Namespace strings? Text node content? IDs? All of them? I guess I'll 
 have to study the source to get the answer. :)

  markup tag names, very small text node values, ID/REFs, DTD attribute
default values, namespace names. With libxslt you also get stylesheet
names.
  General text node content is not added; this would explode and be unusable.

 If one blows away a dictionary once every while, what happens to the 
 things referencing things inside it?

  They will point to freed memory. So don't free the dictionary until
it is not in use anymore. You can use another one, but you will lose
uniqueness of strings.

Daniel
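The lifetime rule above is commonly enforced with a reference count; libxml2 exposes xmlDictReference() and xmlDictFree() for this purpose. The following is only a self-contained sketch of the idea, with hypothetical names (shared_dict and friends), not libxml2's implementation: each document sharing the dictionary takes a reference, and the storage is released only when the last holder lets go.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a string dictionary shared by documents. */
typedef struct shared_dict {
    int refs;               /* number of holders still using the strings */
    /* the interned strings themselves would live here */
} shared_dict;

/* Create a dictionary owned by one holder; NULL on allocation failure. */
shared_dict *shared_dict_new(void)
{
    shared_dict *d = calloc(1, sizeof *d);
    if (d != NULL)
        d->refs = 1;
    return d;
}

/* A new document starts sharing the dictionary. */
void shared_dict_ref(shared_dict *d)
{
    d->refs++;
}

/*
 * A holder is done with the dictionary. The memory is released only
 * when the last holder drops it. Returns 1 if this call freed it.
 */
int shared_dict_unref(shared_dict *d)
{
    if (--d->refs > 0)
        return 0;
    free(d);
    return 1;
}
```

With this scheme no document can end up pointing at freed strings, at the cost that the dictionary lives as long as its longest-lived user.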



Re: [xml] setting URL for xmlRelaxNGParserCtxt?

2005-01-26 Thread Daniel Veillard
On Wed, Jan 26, 2005 at 05:39:56PM +0100, Martijn Faassen wrote:
 Daniel Veillard wrote:
 On Wed, Jan 26, 2005 at 03:12:52PM +0100, Martijn Faassen wrote:
 
 So what *is* stored in these dictionaries? I still don't know. Tagnames? 
 Namespace strings? Text node content? IDs? All of them? I guess I'll 
 have to study the source to get the answer. :)
 
 
   markup tag names, very small text node values, ID/REFs, DTD attribute
 default values, namespace names. With libxslt you also get stylesheet
 names.
   General text node content is not added; this would explode and be
   unusable.
 
 Okay, thanks. Even if that memory is not freed ever it isn't too bad. I 
 think I understand also now why you mention IDs, as they may be globally 
 unique strings and there might be many of them. Does namespace names 
 mean their prefixes or the href, or both?

  both,

 It might be interesting for me to try building something on top of the 
 dictionary that caches Python unicode strings so that they don't 
 need to be regenerated all the time. Basically, if I understand it 
 correctly, dictionaries guarantee that there is only a single char* 
 pointer to a piece of textual data, so I could use that pointer as a 

  Yes, uniqueness of the pointer returned by the API is the main guarantee
(note that ptr+1 may not be unique, as boo and foo will be stored at
different locations).

   They will point to freed memory. So don't free the dictionary until
 it is not in use anymore. You can use another one, but you will lose
 uniqueness of strings.
 
 Hm, that sounds tricky. If I have a bunch of documents that share the 
 same dictionary, how would I go ahead and clean a dictionary up? One way 
 would be to hunt all references to dictionaries and replace the 
 dictionary with another one. The other way would be to clean or shrink 
 the dictionary itself.

  You can remove the dictionary only when no more documents reference it.
Trying to change the dictionary of a document dynamically would be expensive
and very tricky.

 Both approaches have a problem I can't seem to figure my way out of:

  So don't do it.

 The strings in the original dictionary (or the strings not known to the 
 dictionary anymore if the dictionary has been 'shrunk') will still be 

  You can't 'shrink' a dictionary; there is no way you can tell whether
a given string needs to be kept or discarded.
  But you can ask a dictionary if it owns a string pointer really fast.

Daniel
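The pointer guarantee discussed in this message can be shown with a toy interning table. In libxml2 the corresponding calls are xmlDictLookup() and xmlDictOwns(); the code below is a self-contained sketch with hypothetical names (intern_table, intern, intern_owns), using a fixed-size array and linear scans where a real dictionary would hash.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_STRINGS 64

/* Toy interning table: each distinct string is stored exactly once. */
typedef struct intern_table {
    char *strings[MAX_STRINGS];
    size_t count;
} intern_table;

/*
 * Return the canonical pointer for s, storing a private copy the first
 * time s is seen. Equal strings always yield the same pointer.
 * Returns NULL if the table is full or allocation fails.
 */
const char *intern(intern_table *t, const char *s)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->strings[i], s) == 0)
            return t->strings[i];

    if (t->count == MAX_STRINGS)
        return NULL;

    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    t->strings[t->count++] = copy;
    return copy;
}

/* Return 1 if p is a pointer handed out by intern() on this table. */
int intern_owns(const intern_table *t, const char *p)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->strings[i] == p)
            return 1;
    return 0;
}

/* Release every stored string; all interned pointers become invalid. */
void intern_free(intern_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->strings[i]);
    t->count = 0;
}
```

Because equal strings share one pointer, two interned names can be compared with == instead of strcmp(), and the pointer itself can serve as a key in an external cache, which is the use Martijn describes for Python strings.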



[xml] Win32 crash in xmlGetGlobalState

2005-01-26 Thread James Smith

Hi all,

I'm having a bit of a problem using libxml2 (version 2.6.17)
with multiple threads under Windows (VC++ 6, SP6), and I was wondering if
anyone could shed any light on what the issue might be.

I'm trying to use xmlDocDumpMemoryEnc() from a child
thread in my Windows app, using the libxml2 DLL. Basically, it crashes inside
this function when I call it. I've compiled myself a debug version
of libxml2 from the source, and discovered that the function that causes the
crash is xmlGetGlobalState() in __xmlDefaultBufferSize(), which is used when
allocating the memory to dump to (in xmlBufferCreate()). I've looked back
through the archives, and tried the various things I could find that could be
related:

1) I'm calling xmlInitParser() at the very beginning of my app, as
   instructed in the Thread Safety page on xmlsoft.org.

2) Compiling libxml2 with different library versions (i.e. Multithreaded
   DLL, Multithreaded debug DLL) doesn't make any difference. Both libxml2
   and all my code are compiled with /MDd, and I still have the problem.

3) Compiling libxml2 with different thread types doesn't help either
   (native threads are the only ones that work).

Does anyone know if there are any basic steps that I've
missed that would cause libxml2 to crash when getting the global state from
inside a child thread?

cheers,

James Smith  [EMAIL PROTECTED]








Re: [xml] setting URL for xmlRelaxNGParserCtxt?

2005-01-26 Thread Martijn Faassen
Daniel Veillard wrote:
On Wed, Jan 26, 2005 at 05:39:56PM +0100, Martijn Faassen wrote:
[snip]
Both approaches have a problem I can't seem to figure my way out of:
  So don't do it.
I'm not going to do it any time soon, just running it by you to see 
whether I comprehend it now or whether there was an approach I missed. 
It looks rather hopeless to periodically clean up dictionaries in the 
library I'm building, so I'll just hope it won't become a big issue. I 
don't expect it to start taking up a lot of space except in your 
aforementioned ID case. There aren't that many different tagnames and 
namespace URIs in the world, after all.

The strings in the original dictionary (or the strings not known to the 
dictionary anymore if the dictionary has been 'shrunk') will still be 
  You can't 'shrink' a dictionary; there is no way you can tell whether
a given string needs to be kept or discarded.
Right, I said it longer but that's my conclusion.
  But you can ask a dictionary if it owns a string pointer really fast.
It's definitely an interesting thing to consider for a fast Python 
string lookup hash table.

Thanks for the info!
Regards,
Martijn


Re: [xml] win32 build system

2005-01-26 Thread Daniel Veillard
On Wed, Jan 26, 2005 at 07:55:16PM +0100, Francesco Montorsi wrote:
 Hi,
 
  How much of it is the source, how much of it is generated.
 Only two files are templates:
 - libxml2.bkl
 - bakefiles.bkgen
 all the other ones are generated using them.

  okay

 It seems
 reasonable to me to add and maintain the bakefile source in the archive,
 associated with a README explaining how to use this alternate mechanism,
 and then anybody willing to use that instead of whatever the default
 build is should be able to follow the guideline and use the bakefile.
 Ok; I created a simple Readme.txt and I have updated the zip
 (always at http://www.geocities.com/f18m_cpp217828/prog/libxml2_win32.zip)
 
 
 However:
  - adding the .zip file to the distribution does not make sense to me
 I agree; I meant to say that the contents of the zip could be added... :-)
 
  - nor adding all the generated makefiles
 Why? This would make it much easier for everyone to use the build system.

  Because the official Windows build is Igor's one. Igor takes most of the
Windows issues and problems. If we clearly put a different build out then
people will start pushing requests on him while they use a different build
system. And as far as I could see, on Windows new build way == new bugs, cf.
for example the bug that Joel and William have been chasing so far
unsuccessfully, just because they changed /MD to /MT on the Microsoft compiler.
I want the people who use bakefile to at least have gone through the
readme and the bakefile install before hitting any potential bug, i.e.
those people can work and debug at least minimally on their own.

 If only the templates are given, then before compiling the user should
 download bakefile, install it and run it.

  Precisely.

 The size of the BUILD folder is about 317 Kb and I think it wouldn't be a 
 great problem adding it entirely...

  It's less a question of size, and more a question of making sure
people building this way know what they are doing. Bakefile is far from
ubiquitous. I'm fine suggesting it as a solution, but we are not ready
to cope with problems resulting from its use :-)

 so if you indicate clearly what the source is, provide a README and
 indicate willingness to provide updates, I will add them in a bakefile
 subdirectory of the sources.
 My project depends on libxml2 and thus I will continue to use it through
 bakefile makefiles; so I will continue to keep the bakefile updated and
 bug-free :-)
 To work, the bakefile and all generated makefiles should be put into
 a subfolder (the name is not important; I usually use a folder called
 build)

  If I add those 3 files (libxml2.bkl, bakefiles.bkgen, Readme.txt) I will
put them under bakefile, build is far too generic.

Daniel



[xml] Error after upgrade to libxslt 1.1.12

2005-01-26 Thread James Orr
Hi,

I've been using libxslt 1.1.9 with Perl's XML::LibXSLT module.  Ran into
a bit of trouble when upgrading to 1.1.12 ...

I have a registered function which returns an XML::LibXML::NodeList.  It
runs through an array, creates an element for each item and then pushes
it onto the returned nodelist.

In my XSL file I have some things like ...
select=item[ns:myfunction($include)[EMAIL PROTECTED]

and what that would do is select all the item elements with a value
which matches entries in the Perl array.  This works fine in 1.1.9.  In
1.1.12 it gives a segmentation fault.

Is this a bug with libxslt or is there something I need to clear up in
my code?

-- 
James Orr [EMAIL PROTECTED]




Re: [xml] Error after upgrade to libxslt 1.1.12

2005-01-26 Thread James Orr
On Wed, 2005-01-26 at 19:25 -0500, James Orr wrote:
 Hi,
 
 I've been using libxslt 1.1.9 with Perl's XML::LibXSLT module.  Ran into
 a bit of trouble when upgrading to 1.1.12 ...
 
 I have a registered function which returns an XML::LibXML::NodeList.  It
 runs through an array, creates an element for each item and then pushes
 it onto the returned nodelist.
 
 In my XSL file I have some things like ...
 select=item[ns:myfunction($include)[EMAIL PROTECTED]
 
 and what that would do is select all the item elements with a value
 which matches entries in the perl array.  This works fine in 1.1.9.  In
 1.1.12 it gives a segmentation fault.
 
 Is this a bug with libxslt or is there something I need to clear up in
 my code?

Some further information.

It's not libxslt, it's libxml2.  It works with version 2.6.11, it
doesn't with 2.6.16 or higher (not sure about in between those
versions).

I get the segmentation fault whenever myfunction returns more than one
item in the NodeList.

-- 
James Orr [EMAIL PROTECTED]

