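Hey man, why are you butting into my email thread for no reason? If you have spare time, work on your own project properly. Stop wasting our mentor's time.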
On April 1, 2010 at 10:38 AM, xunlong gui <[email protected]> wrote:

> Hi Michael,
> Thank you for your advice, I will fix it :-)
>
> On March 31, 2010 at 10:53 PM, Michael Glavassevich <[email protected]> wrote:
>
>> Hi Yin,
>>
>> I noticed you completely removed the section on community involvement in
>> your most recent draft. In my previous note I wasn't suggesting that you
>> delete it altogether, just that you describe your unique experience in your
>> own words.
>>
>> Engagement with the project community is important and it's a good idea
>> for everyone to mention it in their proposal.
>>
>> Thanks.
>>
>> Michael Glavassevich
>> XML Parser Development
>> IBM Toronto Lab
>> E-mail: [email protected]
>> E-mail: [email protected]
>>
>> "xiaohei.leiyin" <[email protected]> wrote on 03/30/2010 10:19:16 AM:
>>
>> > Hi Michael,
>> >
>> > Thank you so much. I think I have found a way to implement
>> > Asynchronous LSParser and parseWithContext for Xerces, and I am sure
>> > I can finish it well. Most importantly, I am interested in XML
>> > parsing work, and that motivates me.
>> > I am really looking forward to becoming one of the Xerces committers :-)
>> >
>> > In addition, here is my submitted proposal. If you have time, I would
>> > welcome any suggestions from you.
>> >
>> > ------------------------------------------------------------
>> > Project Title: Implement Xerces' Asynchronous LSParser and parseWithContext
>> > Student Name: Yin Lei
>> > Student Email: [email protected]
>> > Organization/Project: Apache Foundation/Xerces
>> > Assigned Mentor: Michael Glavassevich
>> >
>> > Proposal Abstract:
>> > Apache Xerces2 is a powerful XML parser that implements a collection
>> > of standard APIs for XML processing. Although Xerces has a functional
>> > DOM Level 3 LSParser, a couple of parts of the spec still need to
>> > be implemented. This project will provide an asynchronous version of
>> > LSParser, which returns from the parse method immediately and builds
>> > the DOM tree on another thread, and will implement the function
>> > parseWithContext, which allows a document fragment to be parsed and
>> > attached to an existing DOM.
>> >
>> > Detailed Description:
>> > Apache Xerces-J is a high-performance, standards-compliant processor
>> > written in Java for parsing, validating, serializing and
>> > manipulating XML documents. It provides a complete implementation of
>> > the Document Object Model Level 3 Core and Document Object Model
>> > Level 3 Load and Save Recommendations, but Xerces' implementation of
>> > LSParser has two limitations
>> > (http://xerces.apache.org/xerces2-j/dom3.html):
>> > 1. It does not support an asynchronous LSParser, which returns from the
>> > parse method immediately and builds the DOM tree on another thread.
>> > 2.
>> > It does not support the function parseWithContext of interface LSParser,
>> > which parses an XML fragment from a resource identified by an LSInput
>> > and inserts the content into an existing document at the position
>> > specified with the context and action arguments.
>> >
>> > In order to address these two limitations, I have been studying the
>> > W3C Recommendation for LSParser. In the meantime, I have downloaded
>> > the Xerces2-J source code, imported it into my Eclipse workspace,
>> > read it over and over, and considered how to implement these two
>> > parts of the specification. At the same time, I discussed the subject
>> > with the Xerces developers (you have helped me a lot, thank you,
>> > especially dear Michael Glavassevich). I now have some ideas for a
>> > solution and have done some experiments to check it. This is only a
>> > high-level solution, and I have left out some details.
>> >
>> > 1. Interface DOMImplementationLS: class
>> > org.apache.xerces.dom.CoreDOMImplementationImpl implements this
>> > interface. As described in the W3C Recommendation, an implementation
>> > of DOMImplementationLS should supply a function createLSParser that
>> > can create a synchronous LSParser as well as an asynchronous LSParser,
>> > but at present we can only get the former from
>> > CoreDOMImplementationImpl's createLSParser. So I should fix this
>> > problem.
>> >
>> > 2. Interface LSParser: class org.apache.xerces.parsers.DOMParserImpl
>> > implements this interface, but it supports the synchronous mode
>> > only; its getAsync function simply returns false. Here is my
>> > solution for providing an asynchronous version of LSParser.
>> >
>> > Step one: DOMParserImpl implements interface EventTarget as well as
>> > interface LSParser.
>> > It uses a Vector object (we name it repository) to store all the
>> > action listeners registered on the current LSParser object.
>> > Each listener is made up of three parts: type, useCapture and an event
>> > handler function. There are only two types of event, load and
>> > progress. My next task is to implement the functions
>> > addEventListener, dispatchEvent and removeEventListener.
>> > addEventListener: just add an action listener object to the
>> > repository. Note that a listener with the same parameters
>> > can only be added once.
>> > dispatchEvent: traverse each item of the repository; if an item has
>> > the same type value as the event and its useCapture value is
>> > true, invoke its handleEvent function.
>> > removeEventListener: traverse each item of the repository; if an item
>> > is the same as the object in the parameter, remove this item
>> > from the repository.
>> >
>> > Step two: implement interface LSLoadEvent. In the asynchronous LSParser,
>> > LSLoadEvent is used to signal that the parse function has
>> > finished its parse job. We can achieve this by calling LSParser's
>> > dispatchEvent function, which will receive the LSLoadEvent as a parameter.
>> >
>> > Step three: implement interface LSProgressEvent. In the asynchronous
>> > LSParser, the parse thread will trigger an LSProgressEvent when it
>> > finishes parsing an entity node. The triggered LSProgressEvent will
>> > tell the LSParser the current parse position. If it sees more external
>> > resource references, it may also change the totalSize value.
>> >
>> > Step four: implement the asynchronous mechanism. DOMParserImpl has an
>> > attribute which marks its mode, synchronous or asynchronous. We can
>> > get its parse mode from the function getAsync. If the parser is
>> > asynchronous, when the LSParser instance's parse() function is
>> > called, set the busy value to true, start a Thread to parse the XML
>> > document in the LSInput, and then return null.
>> > When the XML parse thread finishes its parse job, set the busy value
>> > to false, create an LSLoadEvent instance with type value load, and call
>> > the function dispatchEvent(Event evt). If the user has registered any
>> > action listener for the load event, the dispatchEvent function will run
>> > the work defined in the listener's handleEvent function.
>> >
>> > 3. Function parseWithContext(LSInput input, Node contextArg, short action)
>> >
>> > This function parses an XML fragment from a resource identified by an
>> > LSInput and inserts the content into an existing document at the
>> > position specified with the context and action arguments. This XML
>> > fragment is a special data structure, so I need to construct a new class
>> > named XMLFragment to store it. Then I should do the following:
>> > Parse the XML fragment into an XMLFragment object and mark whether it
>> > is a complete XML document; if any error happens, throw an exception.
>> > These classes can help me:
>> > a. org.apache.xerces.impl.XMLDocumentFragmentScannerImpl
>> > b. org.apache.xerces.impl.XMLDocumentScannerImpl.FragmentContentDispatcher
>> > c. org.apache.xerces.impl.XMLEntityScanner
>> > I can use some functions in these classes and start the parsing job
>> > by consulting the function startEntity in class
>> > org.apache.xerces.impl.XMLDocumentScannerImpl and the function
>> > scanDocument in class
>> > org.apache.xerces.impl.XMLDocumentFragmentScannerImpl. Here is the
>> > basic implementation idea (in fact, this is a recursive process):
>> > 1. Create an XMLFragment instance. It has a very important
>> > attribute, which we name fCurrentNode.
>> > 2. Start reading characters from the LSInput stream. If a node start
>> > character such as "<" is read from the input stream, go to step 3. If
>> > a node end character such as "/" is read from the input stream, go to
>> > step 4. If the end of the file is reached, go to step 5.
>> > 3.
>> > A begin event happens. Usually, instantiate a Node object, append the
>> > Node instance to fCurrentNode's child node list, change
>> > fCurrentNode to this Node, then go to step 2.
>> > 4. An end event happens. Usually, we should change fCurrentNode to its
>> > parent node, then go to step 2.
>> > 5. The parsing job ends.
>> >
>> > Start adding the XMLFragment at the place indicated by the parameter
>> > action. In this phase, we have a lot of validation to do,
>> > covering four aspects: Basic Validation, Namespace Validation, DTD
>> > Validation and Schema Validation.
>> > 1. Basic Validation:
>> > I should validate whether this merge is legal. For example, if
>> > the context Node is the document root element and the parameter action
>> > is not ACTION_REPLACE_CHILDREN, an error should be thrown.
>> > I should confirm that the merged XML document is well-formed. For
>> > example, the DOM should have only one root element, and entity
>> > declarations must be at the beginning of the document, etc.
>> > 2. Namespace Validation: I should validate both the element namespaces
>> > and the attribute namespaces of the merged XML document.
>> > 3. DTD Validation:
>> > a. Validate whether the merged XML document conforms to the
>> > Element Type Declarations.
>> > b. Validate whether the merged XML document conforms to the
>> > Entity Declarations.
>> > c. Validate whether the merged XML document conforms to the
>> > Attribute Declarations. For example, if the DTD file includes default
>> > attribute declarations, I should add default attributes to the
>> > elements which are roots in the LSInput fragment.
>> > 4.
>> > Schema Validation: this section includes the validation requirements
>> > of DTD Validation, plus some more:
>> > Validate the data types of elements and attributes.
>> > Validate the three kinds of annotation declarations.
>> > If everything is OK, return the result Node. Otherwise, if an error
>> > occurs, the caller is notified through the ErrorHandler instance
>> > associated with the "error-handler" parameter of the
>> > DOMConfiguration. As the new data is inserted into the document, at
>> > least one mutation event is fired per new immediate child or sibling
>> > of the context node.
>> >
>> > Additional Information:
>> > My development plan:
>> >
>> > 1st week of 1st month (May 24 - Jun 1)
>> > Read the Xerces-J source code and get familiar with its
>> > architecture, so that what I do will comply with its philosophy.
>> >
>> > 2nd week of 1st month (Jun 1 - Jun 8)
>> > Make changes to DOMImplementationLS and DOMParserImpl so that
>> > DOMImplementationLS can create an asynchronous LSParser, and add some
>> > basic attributes to DOMParserImpl, such as the asynchronous flag.
>> >
>> > 3rd week of 1st month (Jun 9 - Jun 16)
>> > Restructure DOMParserImpl to implement interface EventTarget;
>> > implement addEventListener, dispatchEvent and removeEventListener.
>> >
>> > 4th week of 1st month (Jun 17 - Jun 24)
>> > Implement the LSParser parse() and parseURI() functions with
>> > asynchronous support; implement the LSParser functions abort(),
>> > getAsync() and getBusy().
>> >
>> > 1st week of 2nd month (Jun 25 - Jul 2)
>> > Implement interfaces LSLoadEvent and LSProgressEvent; finish the whole
>> > asynchronous parse cycle and some unit tests.
>> >
>> > 2nd week of 2nd month (Jul 3 - Jul 10)
>> > Finish the first sub-task of function parseWithContext(): parse the
>> > LSInput into an XMLFragment instance.
>> >
>> > 3rd week of 2nd month (Jul 11 - Jul 18)
>> > Start merging the context Node and the XML
>> > fragment document; finish Basic Validation and Namespace Validation.
>> >
>> > 4th week of 2nd month (Jul 19 - Jul 26)
>> > Finish merging the context DOM tree and the XMLFragment; finish
>> > DTD Validation and Schema Validation.
>> >
>> > 1st week of 3rd month (Jul 27 - Aug 3)
>> > Test my asynchronous LSParser and the function parseWithContext.
>> >
>> > Last 2 weeks of 3rd month (Aug 3 - Aug 20)
>> > Submit all code and documents.
>> >
>> > Who am I?
>> > Hi, everyone. My name is Yin Lei. I am a postgraduate student at the
>> > University of Science and Technology Beijing, China. My major is
>> > computer science and technology. During my six years of Java
>> > development experience, Apache has helped me so much; many projects
>> > such as Struts, Tomcat, Xerces, Xalan, HttpClient, Commons
>> > FileUpload, JavaMail and POI have played an important part in my
>> > research projects. So I am eager to participate in the open source
>> > community and become a long-term committer on the project. In my
>> > daily work I use Xerces as my XML parser, so I have found what it
>> > lacks and want to improve it to make it perfect :-)
>> > My work experience and related awards:
>> > 2007.7 - 2008.5: worked at Sun Microsystems Inc. as an intern
>> > 2008.7 - 2009.12: worked at IBM China Development Laboratory as an intern
>> > 2008.9: named excellent team member of the 2008 IBM Blue Pathway program
>> > 2009.11: won the Lotus Innovation Award of IBM Asia Pacific
>> > I have also done some open source work before. My first experience in
>> > open source development was building an Eclipse plugin for the Apache
>> > SCXML engine, and I also attempted to add a new feature to the SCXML
>> > engine to make it support multi-threaded operation. I can code in
>> > C++, Java and some scripting languages such as JavaScript and
>> > ActionScript. In addition, I'm familiar with XML, DOM, SAX, JDOM and
>> > Dom4j, and I want to improve existing XML parsing tools through my work.
>> >
>> > 2010-03-30
>> >
>> > xiaohei.leiyin
>> >
>> > From: Michael Glavassevich
>> > Sent: 2010-03-30 20:10:21
>> > To: xiaohei.leiyin
>> > Cc:
>> > Subject: Re: GSoC proposal about "Asynchronous LSParser and parseWithContext"
>> > Hi Yin,
>> >
>> > Yes, that's fine. If your proposal is accepted for GSoC I would
>> > mentor you and I think that's what they're looking for there on the
>> > GSoC site.
>> >
>> > There are usually several hundred proposals submitted to Apache
>> > every year for the various projects across the organization. It can
>> > be very competitive depending on the number of spots that Google
>> > actually awards to Apache and the number of good proposals submitted
>> > by students. I wish you good luck in the selection process.
>> >
>> > Thanks.
>> >
>> > Michael Glavassevich
>> > XML Parser Development
>> > IBM Toronto Lab
>> > E-mail: [email protected]
>> > E-mail: [email protected]
>> >
>> > "xiaohei.leiyin" <[email protected]> wrote on 03/30/2010 01:02:31 AM:
>> >
>> > > Dear Michael,
>> > >
>> > > I have modified my proposal following your advice and submitted it on
>> > > the GSoC web site. I noticed that there is an item
>> > > "Organization/Project:Assigned Mentor:" in the content section of the
>> > > proposal submission page. So, can I fill it in as
>> > > "Organization/Project:Apache Foundation/Xerces Assigned
>> > > Mentor:Michael Glavassevich"? Is that OK? I mean, may I list you as
>> > > my assigned mentor? If you think it is OK, I will keep it; if you do
>> > > not like it for some reason, please let me know and I will change it
>> > > (that is fine, I must respect you. In Chinese culture you are already
>> > > my teacher, and respecting teachers is a long-standing,
>> > > well-established Chinese tradition; we call it
>> > > 尊师重教 [zun shi zhong jiao]).
>> > >
>> > > During these days, while I was studying Xerces' architecture and
>> > > working out how to implement Asynchronous LSParser and
>> > > parseWithContext for Xerces, I found I gained a lot of knowledge and
>> > > made great progress. When I came across difficulties, you
>> > > helped me a lot; in fact, you are my mentor in my heart. Thank you
>> > > so so so so much! I think I have gained knowledge whether or not GSoC
>> > > accepts my proposal. I will finish this project; once I have begun
>> > > it, I want to finish it, for you, for open source.
>> > >
>> > > Your student: Yin Lei from China
>> > >
>> > > 2010-03-30
>> > >
>> > > xiaohei.leiyin
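
For readers following the quoted proposal, the asynchronous flow in its Step one through Step four (a listener repository, a busy flag, a parse thread, and a load notification when parsing finishes) might look roughly like the sketch below. This is only an illustration under stated assumptions, not Xerces code: `AsyncParseSketch`, `AsyncParser`, `LoadListener` and `parseAndAwaitRoot` are hypothetical names, the listener is a simplified stand-in for the DOM `EventListener`/`LSLoadEvent` pair, progress events are left out, and the JDK's `DocumentBuilder` does the actual parsing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class AsyncParseSketch {

    /** Hypothetical stand-in for a "load" event listener (not the W3C EventListener). */
    interface LoadListener {
        void handleLoad(Document document);
    }

    /** Hypothetical asynchronous parser: a listener repository plus one worker thread per parse. */
    static final class AsyncParser {
        private final CopyOnWriteArrayList<LoadListener> repository = new CopyOnWriteArrayList<>();
        private final AtomicBoolean busy = new AtomicBoolean(false);

        /** Registers a listener; registering the same instance twice has no effect. */
        void addLoadListener(LoadListener listener) {
            repository.addIfAbsent(listener);
        }

        void removeLoadListener(LoadListener listener) {
            repository.remove(listener);
        }

        boolean getAsync() {
            return true;
        }

        boolean getBusy() {
            return busy.get();
        }

        /** Starts parsing on another thread and returns null immediately. */
        Document parse(byte[] xml) {
            if (!busy.compareAndSet(false, true)) {
                throw new IllegalStateException("parser is busy");
            }
            Thread worker = new Thread(() -> {
                Document document;
                try {
                    document = DocumentBuilderFactory.newInstance()
                            .newDocumentBuilder()
                            .parse(new ByteArrayInputStream(xml));
                } catch (Exception e) {
                    // Assumption: a real parser would report this through its error handler.
                    busy.set(false);
                    return;
                }
                busy.set(false); // clear the flag before announcing completion
                for (LoadListener listener : repository) {
                    listener.handleLoad(document);
                }
            }, "async-parse-sketch");
            worker.start();
            return null;
        }
    }

    /** Parses asynchronously, waits for the load callback, and returns the root tag name (null on failure). */
    static String parseAndAwaitRoot(String xml) {
        AsyncParser parser = new AsyncParser();
        CountDownLatch loaded = new CountDownLatch(1);
        AtomicReference<String> root = new AtomicReference<>();
        LoadListener listener = document -> {
            root.set(document.getDocumentElement().getTagName());
            loaded.countDown();
        };
        parser.addLoadListener(listener);
        parser.addLoadListener(listener); // duplicate registration is ignored
        parser.parse(xml.getBytes(StandardCharsets.UTF_8)); // returns null right away
        try {
            if (!loaded.await(10, TimeUnit.SECONDS)) {
                return null;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return root.get();
    }

    public static void main(String[] args) {
        System.out.println(parseAndAwaitRoot("<a><b/></a>"));
    }
}
```

The sketch uses a compare-and-set on the busy flag so that two overlapping `parse` calls cannot both start a worker, and it clears the flag before notifying listeners, matching the order the proposal describes.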

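
The merge step of parseWithContext described in the quoted proposal (insert the parsed fragment at the position given by the context node and the action argument) can be illustrated with standard DOM calls. The sketch below does not use the Xerces scanner classes the proposal names; it swaps in a simpler technique, wrapping the fragment in a synthetic `<wrapper>` element so an ordinary `DocumentBuilder` accepts it. `ParseWithContextSketch`, its `parseWithContext` helper and `demo` are hypothetical names, only element context nodes are handled, and namespace, DTD and schema validation are omitted. The `ACTION_*` constants are the ones defined on `org.w3c.dom.ls.LSParser`.

```java
import java.io.StringReader;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.ls.LSParser;
import org.xml.sax.InputSource;

final class ParseWithContextSketch {

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Hypothetical helper (not a Xerces API): parses the fragment and merges it at the context node. */
    static Node parseWithContext(String fragment, Node context, short action) throws Exception {
        boolean asChildren = action == LSParser.ACTION_APPEND_AS_CHILDREN
                || action == LSParser.ACTION_REPLACE_CHILDREN;
        boolean asSibling = action == LSParser.ACTION_INSERT_BEFORE
                || action == LSParser.ACTION_INSERT_AFTER
                || action == LSParser.ACTION_REPLACE;
        if (!asChildren && !asSibling) {
            throw new IllegalArgumentException("unknown action: " + action);
        }
        if (context.getNodeType() != Node.ELEMENT_NODE) {
            throw new IllegalArgumentException("this sketch only handles element context nodes");
        }
        Node parent = context.getParentNode();
        // Simplified "basic validation": sibling actions need an element parent,
        // otherwise the document could end up with more than one root element.
        if (asSibling && (parent == null || parent.getNodeType() != Node.ELEMENT_NODE)) {
            throw new IllegalArgumentException("context node needs an element parent for this action");
        }

        // Parse first, so a malformed fragment leaves the target document untouched.
        Document parsed = parse("<wrapper>" + fragment + "</wrapper>");
        Document owner = context.getOwnerDocument();

        if (action == LSParser.ACTION_REPLACE_CHILDREN) {
            while (context.hasChildNodes()) {
                context.removeChild(context.getFirstChild());
            }
        }
        Node next = context.getNextSibling(); // insertion point for INSERT_AFTER and REPLACE
        Node first = null;
        for (Node child = parsed.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            Node imported = owner.importNode(child, true);
            if (asChildren) {
                context.appendChild(imported);
            } else if (action == LSParser.ACTION_INSERT_BEFORE) {
                parent.insertBefore(imported, context);
            } else {
                parent.insertBefore(imported, next); // a null reference node means append
            }
            if (first == null) {
                first = imported;
            }
        }
        if (action == LSParser.ACTION_REPLACE) {
            parent.removeChild(context);
        }
        return first; // first top-level node of the parsed fragment, or null if it was empty
    }

    /** Merges two items next to <item id='1'/> and returns the ids of the root's element children. */
    static String demo(short action) {
        try {
            Document doc = parse("<root><item id='1'/></root>");
            Node context = doc.getDocumentElement().getFirstChild();
            parseWithContext("<item id='2'/><item id='3'/>", context, action);
            StringJoiner ids = new StringJoiner(",");
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    ids.add(((Element) n).getAttribute("id"));
                }
            }
            return ids.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(LSParser.ACTION_INSERT_AFTER));
    }
}
```

Parsing into a separate document and importing the nodes afterwards keeps the target tree unchanged when the fragment is not well-formed, which is one way to meet the proposal's requirement that errors are raised before the merge begins.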