From what I can see, you are passing only volume.xml to your parser.
If I understand your code and questions correctly, the volume file
simply points to the article files that you actually want to index. It
seems you need to parse the volume file first, get the name/location of
each article file from it, and then parse each article file to build the
entry for Lucene. Or am I mistaken? What does the volume file look like?
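Here is a rough sketch of the two-step approach, using only the JDK's DOM
parser. I have not seen your volume file, so the <article href="..."/>
layout in articleLocations() is purely a guess on my part, and the class
and method names (VolumeWalker, articleLocations, articleFields) are made
up for illustration. The element names fno, doi, atl and bdy are taken
from the article sample quoted below.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: step 1 reads the volume file, step 2 reads one article.
class VolumeWalker {

    // Parse an XML string into a DOM tree; any parser failure is rethrown unchecked.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Step 1. ASSUMPTION: the volume file references each article as
    // <article href="..."/>. Adjust the tag and attribute to the real layout.
    static List<String> articleLocations(String volumeXml) {
        NodeList refs = parse(volumeXml).getElementsByTagName("article");
        List<String> locations = new ArrayList<>();
        for (int i = 0; i < refs.getLength(); i++) {
            String href = ((Element) refs.item(i)).getAttribute("href");
            if (!href.isEmpty()) {
                locations.add(href);
            }
        }
        return locations;
    }

    // Trimmed text of the first element with the given tag, or "" if there is none.
    static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Step 2. Pull the values worth indexing out of one article file.
    // Each map entry would become one field of the Lucene document.
    static Map<String, String> articleFields(String articleXml) {
        Document doc = parse(articleXml);
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("fno", firstText(doc, "fno"));
        fields.put("doi", firstText(doc, "doi"));
        fields.put("title", firstText(doc, "atl"));
        fields.put("body", firstText(doc, "bdy"));
        return fields;
    }
}
```

You would loop over articleLocations(), read each file it names, and hand
the map from articleFields() to whatever code builds your Lucene document.
I left the Lucene calls out on purpose, since they depend on the version
you are running.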
Malcolm wrote:
It's XML like this. The collection has 120-ish volumes with references to
12,107 articles, each of which looks like the one below:
<article>
<fno>A1003</fno>
<doi>10.1041/A1003s-1995</doi>
<fm><hdr><hdr1><ti>IEEE Annals of the History of Computing</ti>
<crt><issn>1058-6180</issn>/95/$4.00 <cci><onm>© 1995
IEEE</onm></cci></crt></hdr1>
<hdr2><obi><volno>Vol. 17</volno>, <issno>No. 1</issno></obi>
<pdt><mo>Spring</mo><yr>1995</yr></pdt>
<pp>pp. 3-3</pp></hdr2></hdr>
<tig>
<atl>About this Issue</atl><pn>pp. 3-3</pn></tig>
<au sequence="first"><fnm>J.A.N.</fnm><snm>Lee</snm><role>Editor‐in‐Chief</role></au>
</fm>
<bdy>
<p>The first issue of our 17th volume is as diverse in topics as any nontheme issue that we
have tried to present over the past many years. However, it still represents the work of the
English‐speaking world of the North Atlantic rather than a broader picture of computing in
the whole world. The Editorial Board and the article editors of the <it>Annals</it>
are doing their best to bring the history of the whole world of computing to our readers,
but it does require authors in other countries to offer their manuscripts for our
consideration. Please take this as an open invitation to authors in other parts of the
world to submit papers to the <it>Annals</it>
for review and help us to follow the lead of our parent organization in being the “The
World’s Computer Society.”</p>
<p>The five major articles in this issue represent several manuscripts that have been in our files for some time,
and we are grateful to the authors for having “stuck with us” while we reviewed,
re‐reviewed, and reworked their papers. Articles in the field of history do not always present the work of
the authors themselves (though we welcome pioneers to give us their own stories, as in the case of the 1935 article by
John McPherson in this issue); thus, answering the question “is it accurate?” is not always easy.
In fact, we ask our referees to answer the following questions about each manuscript, and their responses determine
whether we accept the manuscript “as is” or whether we ask the author(s) to revise the
material:</p>
<l2>
<li>
<p>Are the issues addressed in the paper stated clearly enough?</p></li>
<li>
</bdy></article>