On May 12, 2022 20:44:22 GMT+01:00, "Hammer, Erich F" wrote:
>Danielle,
>
>.DOCX files are just a collection of zipped xml and image files. You can see
>this by changing the extension (on a copy) on the file and then exploring. It
>should be possible to parse out the data from the XML
There are several citation parsers available; you may want to explore which
one works best for you. Here are some that I know of or have used:
1. Anystyle.io
2. GROBID
3. Excite
4. Outside
5. biblio-glutton
6. CERMINE
Hope it helps.
On Fri, May 13, 2022 at
And for going beyond the bibliographic citations to include abstracts as
well, https://grobid.readthedocs.io/en/latest/ might be useful. --Kevin
On 5/12/22 1:49 PM, Julia Bauder wrote:
Hi, Danielle,
Have you taken a look at https://text2bib.economics.utoronto.ca/ ? If it
works for you,
Danielle,
.DOCX files are just a collection of zipped XML and image files. You can see
this by changing the extension on a copy of the file to .zip and then exploring
the contents. It should be possible to parse out the data from the XML file(s)
and build a structure from it.
Erich
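
To illustrate the suggestion above, here is a minimal sketch in Python that
opens a .docx as a zip archive and pulls out the paragraph text. It assumes
the main body lives in word/document.xml, which is the usual location in
Word-generated files. The function name docx_paragraphs is only illustrative,
and the sketch ignores footnotes, headers, and formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(docx_file):
    """Return the text of each non-empty paragraph in a .docx file.

    docx_file may be a path or a binary file-like object.
    Assumption: the document body is stored in word/document.xml.
    """
    # A .docx is a zip archive, so no renaming is needed to read it.
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for p in root.iter(W_NS + "p"):
        # A paragraph is split into runs; join the text of every <w:t> node.
        text = "".join(t.text or "" for t in p.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Calling docx_paragraphs("bibliography.docx") (a hypothetical file name) would
return a list of strings, one per paragraph. Grouping those paragraphs into
citation and abstract pairs would still depend on how the entries are laid out
in the actual document.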
On Thursday, May 12, 2022
Let’s try this again without my hitting ‘send’ when I want to send it to
drafts. (Yay, mystery meat navigation in cell phone interfaces)
> On May 12, 2022, at 2:40 PM, Danielle Reay wrote:
>
> Hello,
>
> We have a faculty member looking to create a dataset from an annotated
> bibliography she compiled. Right now it exists as a Word file and as a PDF.
> The entries are relatively structured with a citation and an abstract,
Danielle Reay wrote:
Hello,
We have a faculty member looking to create a dataset from an annotated
bibliography she compiled. Right now it exists as a Word file and as a PDF.
The entries are relatively structured with a citation and an abstract, but
the document is about 150 pages long with
Hi, Danielle,
Have you taken a look at https://text2bib.economics.utoronto.ca/ ? If it
works for you, that's likely to be one of the easiest methods to convert
the list into structured data.
Best,
Julia
Julia Bauder
Social Studies and Data
Hello,
We have a faculty member looking to create a dataset from an annotated
bibliography she compiled. Right now it exists as a Word file and as a PDF.
The entries are relatively structured with a citation and an abstract, but
the document is about 150 pages long with multiple entries per page.