Hi Yin,

I think you may be underestimating the effort involved. Hiranya, one of our
former GSoC students and now a Xerces-J committer, made a very good point
last month about time constraints. It's very important to take the time
constraints you'll have over the GSoC period into account in your planning,
to estimate how much time you'll need to complete the various parts of the
project, and to work with the community (which you've been doing so far) to
carve out realistic goals and a project that would be manageable for you
to complete in the time you would have.

I did start reading your proposal (it would be a good idea to share it with
the mailing list when you're comfortable with what you have) and noticed
that you're only giving yourself one week to work on parseWithContext(). If
I were working on it, I don't think I could finish it in that time and have
it functioning correctly (with decent performance), even with lots of
coffee and little sleep. There are a number of things to take into account
that you didn't mention in your proposal (e.g. the in-scope namespaces at
the context node, entity declarations on the DocumentType node, default
attributes). As I've said before, it's not a simple method, and I would
encourage you to think a bit more about the design and the time it would
take to implement it.
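
To make the first of those points concrete, here is a minimal sketch (not
Xerces code) of collecting the namespace bindings in scope at a context
element by walking up its ancestors. The class name and the collect() and
parse() helpers are hypothetical, and it assumes the tree was built by a
namespace-aware parser. A fragment parser would need something like this so
that prefixes used in the fragment resolve against the context node.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InScopeNamespaces {

    // Namespace URI that namespace-aware parsers assign to xmlns attributes.
    static final String XMLNS_URI = "http://www.w3.org/2000/xmlns/";

    /** Hypothetical helper: prefix -> URI bindings visible at an element ("" is the default namespace). */
    static Map<String, String> collect(Node context) {
        Map<String, String> bindings = new LinkedHashMap<>();
        // Walk from the context element up to (but not including) the Document node.
        for (Node n = context; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            NamedNodeMap attrs = n.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr a = (Attr) attrs.item(i);
                if (!XMLNS_URI.equals(a.getNamespaceURI())) {
                    continue; // not a namespace declaration
                }
                String prefix = "xmlns".equals(a.getNodeName()) ? "" : a.getLocalName();
                bindings.putIfAbsent(prefix, a.getValue()); // the nearest declaration wins
            }
        }
        return bindings;
    }

    /** Hypothetical helper: namespace-aware parse of a complete document held in a string. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<doc xmlns='urn:a' xmlns:p='urn:p'><item xmlns:p='urn:q'/></doc>");
        Map<String, String> bindings = collect(doc.getDocumentElement().getFirstChild());
        System.out.println(bindings.get("p")); // urn:q (redeclared on <item>)
        System.out.println(bindings.get(""));  // urn:a (inherited from <doc>)
    }
}
```

Entity declarations and default attributes would need similar treatment and
are not covered by this sketch.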

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]

Yin Lei <[email protected]> wrote on 03/23/2010 09:41:57 PM:

> Hi Michael,
> Over the past few days I have done some work on LSParser and
> parseWithContext(), including a survey, research, and coding (actually,
> I have finished my GSoC proposal and sent it to you :-) ). I think I
> could take on more work this summer besides LSParser and
> parseWithContext(), either in the Xerces DOM Level 3 code or in another
> area. Any advice? Thank you.

> 2010/3/23 Michael Glavassevich <[email protected]>
> Hi Yin,
>
> Your example is a well-formed external parsed entity (i.e. XML
> fragment) and so should be allowed as input to parseWithContext().
>
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: [email protected]
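
The distinction discussed in this exchange (well-formed as a document versus
well-formed as a parsed entity) can be illustrated with the standard JAXP
API. This is only a rough sketch under stated assumptions: wrapping the
fragment in a synthetic root element is a stand-in for real fragment
parsing, not how Xerces does or should implement parseWithContext(). It
cannot see the context node's namespaces or entities, does not handle a
leading text declaration, and can be fooled by input that closes the
wrapper itself. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class FragmentCheck {

    /** Returns true if the text parses as a complete XML document. */
    static boolean parsesAsDocument(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            DocumentBuilder b = f.newDocumentBuilder();
            // Replace the default handler so fatal errors are not printed to stderr.
            b.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            b.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed as a document
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    /** Hypothetical approximation of "well-formed as element content": wrap in a synthetic root. */
    static boolean parsesAsContent(String fragment) {
        return parsesAsDocument("<wrapper>" + fragment + "</wrapper>");
    }

    public static void main(String[] args) {
        String two = "<node id=\"1\"><name>one</name></node><node id=\"2\"><name>two</name></node>";
        System.out.println(parsesAsDocument(two)); // false: two root elements
        System.out.println(parsesAsContent(two));  // true: fine as content
        System.out.println(parsesAsContent("<node id=\"1\"><name>one</name>")); // false: missing end tag
    }
}
```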

> Yin Lei <[email protected]> wrote on 03/18/2010 09:43:43 AM:
>
>
> > Hi Michael,
> >
> > First, thank you again for your patient reply :)
> >
> > As you said, "The DOM specification doesn't explicitly say it but
> > I believe the intention is that the input is a well-formed parsed
> > entity." I think "well-formed" in your sentence means something
> > different from "well-formed" in the XML 1.0 specification.
> >
> > In the XML specification, I am sure that well-formed means the
> > document must have exactly one root element; if it has more than
> > one, the document is obviously not well-formed.
> >
> > But for the input parameter of parseWithContext(), I think that if
> > the input stream is an XML fragment made up of two well-formed
> > elements (for example: <node id="1"><name>one</name></node><node
> > id="2"><name>two</name></node>), the input is also well-formed and
> > valid. Am I right? Thank you!
>
> > 2010/3/18 Michael Glavassevich <[email protected]>
> > Hi Yin,
> >
> > parseWithContext() allows anything but the list of things that it
> > explicitly excludes, so yes, elements and processing instructions are
> > allowed, but as stated by the spec things like entity declarations
> > and notation declarations are not. The DOM specification doesn't
> > explicitly say it but I believe the intention is that the input is a
> > well-formed parsed entity [1] as defined by the XML 1.0
> > specification and once parsed the node needs to be allowed at the
> > position it's being inserted in. So if the result is that the
> > document would have two root elements that would be an error. And if
> > the fragment were missing a matching end-tag that would also be an
> > error. In both cases an exception should be raised.
> >
> > Thanks.
> >
> > [1] http://www.w3.org/TR/2006/REC-xml-20060816/#wf-entities
> >
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: [email protected]
>
> > Yin Lei <[email protected]> wrote on 03/17/2010 03:42:10 AM:
> >
> > > Hi Michael,
> > >
> > > First, thank you for your answer. I am a little confused by the
> > > W3C Recommendation's description of the input parameter (of type
> > > LSInput) of parseWithContext(). The Recommendation reads as follows:
> > >
> > > input of type LSInput
> > > The LSInput from which the source document is to be read. The source
> > > document must be an XML fragment, i.e. anything except a complete
> > > XML document (except in the case where the context node of type
> > > DOCUMENT_NODE, and the action is ACTION_REPLACE_CHILDREN), a DOCTYPE
> > > (internal subset), entity declaration(s), notation declaration(s),
> > > or XML or text declaration(s).
> > >
> > > Does it mean that the input XML fragment can only include a DOCTYPE,
> > > entity declarations, notation declarations, or XML or text
> > > declarations? I would like to know whether the fragment can include
> > > an xml-stylesheet processing instruction (for example:
> > > <?xml-stylesheet type="text/css" href="cd_catalog.css"?>), an XML
> > > element (for example: <element
> > > id="1"><name>firstElement</name></element>), or something else.
> > >
> > > If the input is in fact a complete XML document, what should my
> > > parseWithContext() implementation do? Throw an exception?
> > >
> > > One last question: suppose parseWithContext(LSInput input, Node
> > > contextArg, short action) is called with the following arguments:
> > > input: <element id="1"><name>firstElement</name></element>
> > > contextArg: <doc><element id="exist1"><name>leiyin</name></element></doc>
> > > action: ACTION_INSERT_AFTER
> > > I think the function will return the DOM node "<doc><element
> > > id="exist1"><name>leiyin</name></element></doc><element
> > > id="1"><name>firstElement</name></element>". Am I right? And what if
> > > the input is "<element id="1"><name>firstElement</name>" (yes, with
> > > the end tag missing)? What should the implementation do? I think it
> > > should throw an exception. Am I right?
> > >
> > > Thank you very much. I look forward to your reply :)
> > >
> > >
> >
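
The ACTION_INSERT_AFTER question above can be explored with plain DOM calls.
The sketch below is not parseWithContext(); it assumes the fragment has
already been parsed into a node and uses a hypothetical insertAfter() helper
to place it after the context node. It shows why the position matters:
inserting after the inner <element> works, while inserting after <doc>
itself would give the document a second root element, which the DOM
rejects with HIERARCHY_REQUEST_ERR (consistent with the reply elsewhere in
this thread that two root elements is an error).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InsertAfterRoot {

    /** Hypothetical helper: parse a complete document held in a string. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical helper: copy newNode into the context's document and insert it after context. */
    static void insertAfter(Node context, Node newNode) {
        Node imported = context.getOwnerDocument().importNode(newNode, true);
        // insertBefore with a null reference node appends at the end.
        context.getParentNode().insertBefore(imported, context.getNextSibling());
    }

    public static void main(String[] args) {
        Document target = parse("<doc><element id=\"exist1\"><name>leiyin</name></element></doc>");
        Node parsed = parse("<element id=\"1\"><name>firstElement</name></element>").getDocumentElement();

        // Context is the inner element: the new element becomes its next sibling.
        insertAfter(target.getDocumentElement().getFirstChild(), parsed);
        System.out.println(target.getDocumentElement().getChildNodes().getLength()); // 2

        // Context is the root element: this would create a second root.
        try {
            insertAfter(target.getDocumentElement(), parsed);
        } catch (DOMException e) {
            System.out.println(e.code == DOMException.HIERARCHY_REQUEST_ERR); // true
        }
    }
}
```
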
> > > 2010/3/16 Michael Glavassevich <[email protected]>
> > > Hi Yin,
> > >
> > > parseWithContext() isn't quite that simple. The input to this method
> > > is an XML fragment which would generally not be a complete XML
> > > document.
> > >
> > > The goal of the project is to implement support for parsing XML
> > > fragments. LSParser.parse(LSInput) is only capable of reading
> > > complete XML documents and so cannot be used to implement
> > > parseWithContext().
> > >
> > >
> > > Thanks.
> > >
> > > Michael Glavassevich
> > > XML Parser Development
> > > IBM Toronto Lab
> > > E-mail: [email protected]
> >
> > > Yin Lei <[email protected]> wrote on 03/12/2010 09:28:12 AM:
> > >
> > >
> > > > Hi Michael,
> > > >
> > > > I would like to take on the project "Asynchronous LSParser and
> > > > parseWithContext" as my GSoC 2010 proposal, and I have been
> > > > studying the Xerces code for a long time. Here is my idea for the
> > > > project; I would welcome your advice.
> > > >
> > > > In the Xerces-J 2.9.1 code, the class
> > > > org.apache.xerces.parsers.DOMParserImpl implements the LSParser
> > > > interface, but DOMParserImpl does not support asynchronous
> > > > operation. To make DOMParserImpl support it, we can take the
> > > > following two steps:
> > > >
> > > > 1. DOMParserImpl should implement the EventTarget interface
> > > >
> > > > Use a Vector object (call it the repository) to store all the
> > > > event listeners registered on the current LSParser object. Each
> > > > registration is made up of three parts: type, useCapture, and the
> > > > event handler. We then need to implement addEventListener,
> > > > dispatchEvent, and removeEventListener:
> > > >
> > > > addEventListener: add a listener registration to the repository.
> > > >     Note that a listener with the same parameters can only be
> > > >     added once.
> > > > dispatchEvent: traverse the repository; for each item whose type
> > > >     matches the event's type and whose useCapture value is true,
> > > >     call its handleEvent function.
> > > > removeEventListener: traverse the repository; if an item matches
> > > >     the parameters, remove it from the repository.
> > > >
> > > > 2. Implement the asynchronous mechanism
> > > >
> > > > When the DOMParserImpl object's parse() function is called, set
> > > > busy to true, start a thread to parse the XML document in the
> > > > LSInput, and return null. When the parsing thread finishes its
> > > > job, set busy to false, create an LSLoadEvent object of type
> > > > "load", and call dispatchEvent(Event evt). If the user has
> > > > registered any listener for the load event, its handleEvent
> > > > function is run.
> > > >
> > > >
> > > > Implementing parseWithContext(LSInput input, Node contextArg,
> > > > short action)
> > > >
> > > > parseWithContext is a synchronous function (even if the LSParser
> > > > is in asynchronous mode). I think its implementation is simple
> > > > (maybe because I am naive): we just use DOMParserImpl's
> > > > parse(LSInput input) function to parse the LSInput stream into a
> > > > DOM tree, merge this DOM tree with the Node contextArg according
> > > > to the action parameter, and return the resulting DOM tree.
> > > >
> > > > This is my main idea for the project. If you have any thoughts,
> > > > please let me know. Thank you!
> > > >
> > > > Best regards
> > >
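
The two-step design in the message above (a listener repository plus a
worker thread that fires a load notification) can be sketched roughly as
follows. This is an illustration under assumptions, not DOMParserImpl code:
the class, the Registration type, and the use of Consumer<Document> in
place of EventListener and LSLoadEvent are all hypothetical, and a JAXP
DocumentBuilder stands in for the real parser. One deliberate difference
from the description: in DOM Level 2 Events a capturing listener is not
triggered by an event dispatched directly at the target it is registered
on, so this sketch invokes only the non-capturing listeners.

```java
import java.io.StringReader;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AsyncParserSketch {

    /** One repository entry; equals() lets duplicate registrations be detected. */
    static final class Registration {
        final String type;
        final Consumer<Document> listener;
        final boolean useCapture;

        Registration(String type, Consumer<Document> listener, boolean useCapture) {
            this.type = type;
            this.listener = listener;
            this.useCapture = useCapture;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Registration)) return false;
            Registration r = (Registration) o;
            return type.equals(r.type) && listener == r.listener && useCapture == r.useCapture;
        }

        @Override public int hashCode() { return type.hashCode(); }
    }

    private final CopyOnWriteArrayList<Registration> repository = new CopyOnWriteArrayList<>();
    private final AtomicBoolean busy = new AtomicBoolean(false);

    void addEventListener(String type, Consumer<Document> l, boolean useCapture) {
        repository.addIfAbsent(new Registration(type, l, useCapture)); // same triple is stored once
    }

    void removeEventListener(String type, Consumer<Document> l, boolean useCapture) {
        repository.remove(new Registration(type, l, useCapture));
    }

    int listenerCount() { return repository.size(); }

    boolean isBusy() { return busy.get(); }

    private void dispatchEvent(String type, Document result) {
        for (Registration r : repository) {
            if (r.type.equals(type) && !r.useCapture) r.listener.accept(result);
        }
    }

    /** Starts parsing on a worker thread and returns immediately. */
    void parseAsync(String xml) {
        if (!busy.compareAndSet(false, true)) throw new IllegalStateException("parser is busy");
        new Thread(() -> {
            Document doc = null;
            try {
                doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
            } catch (Exception e) {
                // A real parser would report this through its error handler instead.
            } finally {
                busy.set(false);
            }
            if (doc != null) dispatchEvent("load", doc);
        }).start();
    }

    public static void main(String[] args) throws Exception {
        AsyncParserSketch parser = new AsyncParserSketch();
        CountDownLatch done = new CountDownLatch(1);
        Consumer<Document> onLoad = d -> {
            System.out.println("loaded <" + d.getDocumentElement().getTagName() + ">");
            done.countDown();
        };
        parser.addEventListener("load", onLoad, false);
        parser.addEventListener("load", onLoad, false); // ignored: already registered
        parser.parseAsync("<doc><name>leiyin</name></doc>");
        done.await(5, TimeUnit.SECONDS);
    }
}
```
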
> > > > 2010/3/11 Michael Glavassevich <[email protected]>
> > > > Hi Yin,
> > > >
> > > > Welcome to the mailing list. The LSParser project from the GSoC 2009
> > > > idea list is still available this year if you're interested.
> > > >
> > > > Thanks.
> > > >
> > > > Michael Glavassevich
> > > > XML Parser Development
> > > > IBM Toronto Lab
> > > > E-mail: [email protected]
> > > >
> > > > Yin Lei <[email protected]> wrote on 03/09/2010 10:51:39 PM:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I am interested in the project "Asynchronous LSParser and
> > > > > parseWithContext" on the GSoC 2009 idea list. I noticed that this
> > > > > project has not been done yet, so I have been learning some
> > > > > background for it and would like to take it on as my GSoC 2010
> > > > > proposal. But I am not sure whether this project will still be
> > > > > offered this year. Can anyone tell me?
> > > > >
> > > > > Thank you & best regards
