RE: Efficient C++ XML validating parser?

2008-03-05 Thread ronys
Hi,

If you decide to do without validation, then I've used TinyXML in a couple
of projects, and am pretty happy with the footprint and performance.
http://www.grinninglizard.com/tinyxml/

You're right in that it will require you to write the parser manually, though,
with all that that implies.

Cheers,

Rony

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Amos Shapira
Sent: Wednesday, March 05, 2008 2:56 AM
To: Israel Linux Mailing list
Subject: Efficient C++ XML validating parser?

Hello,

Currently we use Xerces-C++ (http://xerces.apache.org/xerces-c/) to
read XML files but are looking at making this as efficient as
possible.

The XML files are generated by our own software so some of us thought
that maybe we can get rid of validation of the input and go straight
to event handling using SAX parsers.

My concern with this approach is that it sounds like we'll end up with
a hand-written parser for a very specific version of the input schema,
which will require us to keep the code in step with changes in the
schema.

Instead, I was wondering what would be the best way to ask the XML
parser to validate the input. Maybe some tool which converts an XML
schema to tightly integrated C++ code would do the trick? I found
http://tinyurl.com/2wqqp8 but it's just a research paper (NOT free),
not open source code.

What do people around here like to use for EFFICIENT XML parsing?

Thanks,

--Amos

=
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]





Re: Efficient C++ XML validating parser?

2008-03-05 Thread Dotan Shavit
On Wednesday 05 March 2008, Amos Shapira wrote:
> What do people around here like to use for EFFICIENT XML parsing?

A stronger machine.

Don't laugh... it may be much cheaper than developing and maintaining 
software.

#




Re: Efficient C++ XML validating parser?

2008-03-05 Thread ik
Two suggestions:

There are GNU/GTK (depends on how you look at it) tools for handling
XML, and one of the tools in the library validates XML (in code)...

And sorry if this sounds like a troll's answer, but you can take the
native XML implementation in FPC (Free Pascal), write a C++ binding for
it, and then use it with your application :)

Ido

-- 
http://ik.homelinux.org/




Re: Efficient C++ XML validating parser?

2008-03-05 Thread Shachar Shemesh

Gilad Ben-Yossef wrote:
> Amos Shapira wrote:
> > What do people around here like to use for EFFICIENT XML parsing?
>
> Isn't Efficient XML an oxymoron?
>
> Seriously, and despite the flame-bait way I've introduced the subject,
> if you need to do XML parsing in a way which is more efficient than
> Xerces, maybe it is an indication that XML is not a proper way to
> encode your data.

I'll bite.

Without knowing Xerces too deeply, I think you can do MUCH faster than
it, by feeding the schema beforehand. Theoretically (though, the last
time I said this word on this list an actual project came out [1]), you
can write a parser that receives the schema, and produces yacc (or
bison++) output for parsing it. That would, of course, make a compiler
compiler compiler, but who's counting? You can then take the input file,
and follow the usual procedures for generating C++, and then binary,
from it.

> How about using a binary format which is compiled from XML?

In this day and age, is it really all that much faster? What makes XML hard
to parse, IMHO, is not the fact that it's text, it's the fact that it's
hierarchical.

> You get all the benefits of using XML and no parsing overhead.

Well, you lose one benefit - it's no longer in a standard parsable, nor
even textual, format.

> Gilad

Shachar




Re: Efficient C++ XML validating parser?

2008-03-05 Thread Omer Zak
On Wed, 2008-03-05 at 12:43 +0200, Shachar Shemesh wrote:
> Gilad Ben-Yossef wrote:
> > How about using a binary format which is compiled from XML?
> In this day and age, is it really all that faster? What makes XML hard
> to parse, IMHO, is not the fact that it's text, it's the fact that it's
> hierarchical.
>
> > You get all the benefits of using XML and no parsing overhead.
> Well, you lose one benefit - it's no longer in a standard parsable, nor
> even textual, format.

Not necessarily.
If you have a converter between XML and your binary format, and make it
available everywhere your application is available, then the messages
would still effectively be available in XML.  You'll also need some way
to force people to modify the converter whenever they modify the schema.

Another way is to use one of the serializer/unserializer modules
available in scripting languages such as Python or Perl.  This will
transform between your data structure's internal representation and a
binary format.
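
To make the converter idea a bit more concrete, here is a minimal C++
sketch, assuming a made-up record with a numeric id and a name. Record,
encode, decode and toXml are hypothetical names, and the byte layout
(big-endian 32-bit fields, length-prefixed string) is just one possible
choice, not any standard format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical record standing in for whatever the schema describes.
struct Record {
    std::uint32_t id;
    std::string name;
};

// Append a 32-bit value in big-endian byte order.
static void putU32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value at pos and advance pos past it.
static std::uint32_t getU32(const std::string& in, std::size_t& pos) {
    if (in.size() - pos < 4) throw std::runtime_error("truncated input");
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos++]);
    return v;
}

// Binary layout (an assumption of this sketch): id, name length, name bytes.
std::string encode(const Record& r) {
    assert(r.name.size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
    std::string out;
    putU32(out, r.id);
    putU32(out, static_cast<std::uint32_t>(r.name.size()));
    out += r.name;
    return out;
}

// Inverse of encode(); throws if the input is shorter than it claims.
Record decode(const std::string& in) {
    std::size_t pos = 0;
    Record r;
    r.id = getU32(in, pos);
    std::uint32_t len = getU32(in, pos);
    if (in.size() - pos < len) throw std::runtime_error("truncated name");
    r.name = in.substr(pos, len);
    return r;
}

// Regenerate the XML view, escaping the characters special in element text.
std::string toXml(const Record& r) {
    std::string text;
    for (char c : r.name) {
        if (c == '&') text += "&amp;";
        else if (c == '<') text += "&lt;";
        else if (c == '>') text += "&gt;";
        else text += c;
    }
    return "<record id=\"" + std::to_string(r.id) + "\"><name>" + text +
           "</name></record>";
}
```

With something like this shipped next to the application, toXml(decode(bytes))
gives back a textual view of any stored message. Only the binary-to-XML
direction is shown, and the layout has to be updated by hand whenever the
schema changes, which is exactly the maintenance burden mentioned above.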

--- Omer
-- 
May the holy trinity of  $_, @_ and %_ be hallowed.
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html





FSCK Error after upgrade Ubuntu 7.1

2008-03-05 Thread Lev Olshvang


Hello List,

Several days ago I asked the Upgrade Manager to install all available
upgrades for my Kubuntu desktop.

(regular installation, stock kernel 2.6.22-14-generic)
disk partitions:
1  /boot
2  swap
3  /
4  Extended
5  /home

It finished successfully and required a reboot.
After the reboot I got a blank screen filled with  and so I used the SCD
rescue to reinstall grub.

Several strange things happened:

1. fsck on / runs only e2fsck, although I know my / is ext3.

2. When booting from the hard disk, grub does not see /boot/vmlinuz (my boot
   partition is on the first slice of the HD), but it sees it as /vmlinuz.
   The same goes for initrd.

3. When grub finally booted the system, fsck reported an error on / and I
   was thrown into a maintenance shell:

fsck 1.40.2 (12-Jul-2007)
fsck.ext3: symbol lookup error: fsck.ext3: undefined symbol: et_ext2_error_table
fsck.ext3: symbol lookup error: fsck.ext3: undefined symbol: et_ext2_error_table

fsck died with exit status 127

   Since / is mounted read-only, I tried to boot from the SCD and fix the
   filesystem, but the SCD thinks it is ext2 and OK.

4. Although I did not fix a thing (except changing /boot/vmlinuz to
   /vmlinuz in the grub menu and disabling the fsck check in fstab),
   my system continues to boot after I exit the maintenance shell.

5. The system does not see / (/dev/sda3) as mounted anymore.

What went wrong? Why do the SCD rescue and Ubuntu disagree on the /
filesystem type and state?


Regards,
Lev






[HAIFUX Lecture] Advanced Linux Kernel Networking - Neighboring Subsystem; IPSec by Rami Rosen

2008-03-05 Thread Orr Dunkelman
Next Monday, 10th of March, at 18:30, the Haifa Linux Club will meet
again and enjoy a lecture by Rami Rosen about

   Advanced Linux Kernel Networking - Neighboring Subsystem; IPSec

abstract:
* We will deal with the arp table and arp cache,
* Reachability and Neighbor Unreachability Detection (NUD),
* Reachability states (NUD_REACHABLE, NUD_NONE or NUD_FAILED, and more)
* Garbage collection in the Neighboring Subsystem.
* Interaction between the Linux Kernel Routing Subsystem and the Linux
  Kernel Neighboring Subsystem.
* IPsec - tunnels and encryption, xfrm API, IPsec flow cache.

==

We meet in Taub building, room 6. For instructions see:
http://www.haifux.org/where.html

Attendance is free, and you are all invited!

==

Future Lectures:

How Ethernet works                                Nir Abulaffo  24/3/2008
Advanced Linux Kernel Networking - third lecture  Rami Rosen    7/4/2008


We are always interested in hearing your talks and ideas. If you wish
to give a talk, hold a discussion, or just plan some event haifux
might be interested in, please contact us at [EMAIL PROTECTED]

-- 
Orr Dunkelman,
[EMAIL PROTECTED]

Any human thing supposed to be complete, must for that reason infallibly
be faulty -- Herman Melville, Moby Dick.

GPG fingerprint: C2D5 C6D6 9A24 9A95 C5B3  2023 6CAB 4A7C B73F D0AA
(This key will never sign Emails, only other PGP keys. The key
corresponds to [EMAIL PROTECTED])




Next Perl-IL Meeting - Sunday, 16-March-2008

2008-03-05 Thread Shlomi Fish
The Israeli Perl Mongers ( http://www.perl.org.il/ ) are going to have their 
first meeting for 2008 on Sunday, 16-March-2008. The place will be the 
Schreiber MathCS building, room 008 (entrance floor), Tel Aviv University. 
More information on how to get there can be found here:

http://www.cs.tau.ac.il/telux/

The meeting will start at 18:30. The tentative schedule is:

1. Ran Eilam - Config::* - the Alenby St. of CPAN. (30 minutes)

2. Fallback talk if there are no other talks: Shlomi Fish - 
http://www.shlomifish.org/lecture/Perl/Lightning/Too-Many-Ways/ . (15-30 
minutes)

-

Some more talks are being negotiated and we will be happy to host your talk on 
a subject you find interesting.

Sharpen your perls and mark your calendars because a new season of Perl 
meetings is about to start.

Regards,

Shlomi Fish

-
Shlomi Fish  [EMAIL PROTECTED]
Homepage:http://www.shlomifish.org/

I'm not an actor - I just play one on T.V.




Re: Efficient C++ XML validating parser?

2008-03-05 Thread Amos Shapira
On Wed, Mar 5, 2008 at 9:43 PM, Shachar Shemesh [EMAIL PROTECTED] wrote:
> Gilad Ben-Yossef wrote:
> > Amos Shapira wrote:
> > > What do people around here like to use for EFFICIENT XML parsing?
> >
> > Isn't Efficient XML an oxymoron?
> >
> > Seriously, and despite the flame-bait way I've introduced the subject,
> > if you need to do XML parsing in a way which is more efficient than
> > Xerces, maybe it is an indication that XML is not a proper way to
> > encode your data.
> I'll bite.

Thanks to everyone for your answers.

I'm replying to Shachar's reply because his is the closest to what I
have to add to this, plus some more info about my question as I
learned since I sent it.


> Without knowing Xerces too deeply, I think you can do MUCH faster than
> it, by feeding the schema before hand. Theoretically (though, the last

Xerces is apparently the Lincoln of XML parsers, i.e. it supports
everything there is to support in the standard but it comes with a
huge weight attached to it. On my desktop it's the 9th largest library
at almost 4 MB, just ahead of libkhtml and twice the size of libc.
But library size is not all I can say against it - it adheres to the
standard approaches of DOM (tons of objects, lots of memory) or SAX (i.e.
you have to manually handle each event in the code which uses SAX).

There are a few newer approaches to parsing XML files. There is a pretty
good list at http://en.wikipedia.org/wiki/Xml_parser#Processing_XML_files

The one that appeals the most to me is Data Binding
(http://en.wikipedia.org/wiki/Xml_parser#Data_binding). As Shachar
describes below, it's based on a program which reads the schema and
generates code (in my case, C++ classes) which reads files of this
specific schema. The resulting objects are strongly-typed in-memory
representations of the data in the XML file and provide convenient
accessors.

Presumably, because these classes are schema-specific, they can cut a
lot of checks for irrelevant execution paths.
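
As a rough illustration of the data-binding idea (not the output of any
real schema compiler), here is a minimal C++ sketch. The <order id="...">
schema with <item> children and the names Order and OrderBinder are made
up for the example, and the binder is driven by SAX-style callbacks that
some underlying parser is assumed to supply.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Strongly typed in-memory form of a hypothetical document:
//   <order id="42"><item>widget</item><item>gadget</item></order>
class Order {
public:
    int id() const { return id_; }
    const std::vector<std::string>& items() const { return items_; }
private:
    friend class OrderBinder;
    int id_ = 0;
    std::vector<std::string> items_;
};

// The kind of handler a schema compiler might emit: it knows only the
// elements this one schema allows and rejects everything else.
class OrderBinder {
public:
    using Attributes = std::map<std::string, std::string>;

    void startElement(const std::string& name, const Attributes& attrs) {
        if (depth_ == 0 && name == "order") {
            auto it = attrs.find("id");
            if (it == attrs.end())
                throw std::runtime_error("order: missing id attribute");
            order_.id_ = std::stoi(it->second);  // typed conversion
        } else if (depth_ == 1 && name == "item") {
            text_.clear();                       // start collecting item text
        } else {
            throw std::runtime_error("unexpected element: " + name);
        }
        ++depth_;
    }

    void characters(const std::string& text) {
        if (depth_ == 2) text_ += text;          // only <item> holds text
    }

    void endElement(const std::string& name) {
        assert(depth_ > 0);                      // caller must balance tags
        --depth_;
        if (depth_ == 1 && name == "item") order_.items_.push_back(text_);
    }

    const Order& result() const { return order_; }

private:
    Order order_;
    std::string text_;
    int depth_ = 0;
};
```

The calling code then works with order.id() and order.items() instead of
walking a generic DOM tree. Note that the "validation" here is only the
structural check done in startElement; a real generated binding would
presumably enforce much more of the schema.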

If you ever wrote XDR/RPC stuff (I'm talking about the stuff that NFS
and friends use for network-level representation) then it might be
something similar - it used to have a program to convert a
language-independent data representation to various language-specific
implementations of classes to marshal and demarshal data (only I
forgot the name of the XDR compiler right now).

The snag with Data Binding is that all the implementations I found so
far are either for Java or proprietary and cost a fortune (thousands
of dollars per developer seat, where you have to buy a license for
every developer who links his code with the output of the programs).

Ah - and our final programs (the ones we ship to customers) have to
support all sorts of UNIX variants, and Windows, not just Linux.

The only one which keeps our hopes alive is xmlbeansxx
(http://xmlbeansxx.touk.pl/). I'm struggling with getting it to
compile and run for now.

Another one is CodeSynthesis XSD
(http://www.codesynthesis.com/products/xsd/), but it's GPL so we can't
link it with our proprietary code.

Here is a pretty complete list of XML Data Binding resources, almost
all options for C/C++ are commercial:
http://www.rpbourret.com/xml/XMLDataBinding.htm

Thanks again for everyone's input.

--Amos




To maintainers of Israeli open source web sites

2008-03-05 Thread Amos Shapira
Hi All,

It just took me quite a while to find current archives of linux-il on
the net in order to point to an article there.

I finally found a good archive at mail-archive
(http://www.mail-archive.com/linux-il@cs.huji.ac.il/).

I know this question sometimes returns to haunt the list but not being
able to find the archives causes a sort of a chicken-and-egg problem,
doesn't it? :)

So please - if you maintain a web site which mentions Linux-IL please
add a pointer to its archives (mail-archive looks good) to your web
site so Google can find it more easily.

The web sites I looked through include linux.org.il, hamakor.org.il (which
has a non-current archive), http://linuxil.objectis.net/, iglu.org.il,
whatsup and linmagazine.

Thanks,

--Amos




Re: To maintainers of Israeli open source web sites

2008-03-05 Thread Dotan Cohen
On 06/03/2008, Amos Shapira [EMAIL PROTECTED] wrote:
> Hi All,
>
> It just took me quite a while to find current archives of linux-il on
> the net in order to point to an article there.
>
> I finally found a good archive at mail-archive
> (http://www.mail-archive.com/linux-il@cs.huji.ac.il/).
>
> I know this question sometimes returns to haunt the list but not being
> able to find the archives causes a sort of a chicken-and-egg problem,
> doesn't it? :)
>
> So please - if you maintain a web site which mentions Linux-IL please
> add a pointer to its archives (mail-archive looks good) to your web
> site so Google can find it more easily.
>
> Web sites I looked through include linux.org.il, hamakor.org.il (which
> has a non-current archive) and http://linuxil.objectis.net/,
> iglu.org.il, whatsup, linmagazine
>
> Thanks,
>
> --Amos

I will do that this evening. Are there any other high-value sites that
maybe we should be promoting? Of course I'd love to add my own
gibberish.co.il (shameless plug) but if someone wants to publish their
list of important Hebrew or Israeli Linux websites I will pick a few
and link to them as well.

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


OpenOffice in educational institutes in Israel?

2008-03-05 Thread Boaz Rymland

This might interest a few people here:


http://it.themarker.com/tmit/article/2992


Could this *be it*, finally? There have been such plays in the past,
usually to attract attention and get lower prices from M$. This time it
seems like a real first step, and a considerable one(?).

If so, it's a nice way to start the morning :-)

Boaz.

