James Stroud wrote:
I included code in my previous post that will parse the entire bib,
making use of the numbering and eliminating the most probable, but still
fairly rare, potential ambiguity. You might want to check out that code,
as my testing showed that it worked with your
James Stroud wrote:
import re
records = []
record = None
counter = 1
regex = re.compile(r'^(\d+)\. (.*)')
for aline in lines:
    m = regex.search(aline)
    if m is not None:
        recnum, aline = m.groups()
        if int(recnum) == counter:
            if record is not None:
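The fragment above stops partway through the loop. What follows is only a sketch of how a counter-based parse of this kind might be finished, not the code from the original post. The function name parse_numbered_records is hypothetical, and the sketch assumes that citation numbers start in column 1, are never skipped, and are followed by a period and a space.

```python
import re

# Same pattern as in the fragment: digits, a period, a space, then the text.
NUMBERED = re.compile(r'^(\d+)\. (.*)')


def parse_numbered_records(lines):
    """Group lines into records (hypothetical helper, assumptions as stated).

    A new record starts only when a line begins with the next expected
    number, so a stray "1964. ..." in the middle of a citation is treated
    as a continuation line rather than as a new citation.
    """
    records = []
    record = None
    counter = 1
    for aline in lines:
        m = NUMBERED.search(aline)
        if m is not None and int(m.group(1)) == counter:
            # The expected number was found: close the previous record.
            if record is not None:
                records.append(record)
            record = m.group(2).strip()
            counter += 1
        elif record is not None and aline.strip():
            # Continuation line: join it to the current record with a space.
            record = record + ' ' + aline.strip()
    if record is not None:
        records.append(record)
    return records
```

Joining the continuation lines with a single space also takes care of the tab-indented lines mentioned elsewhere in the thread, since each line is stripped before it is appended.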
Necmettin Begiter wrote:
Is this what the text looks like:
123
some information
124 some other information
126(tab here)something else
If this is the case (the numbers are at the beginning, and after the numbers
there is either a newline or a tab), the logic might be this simple:
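The message is cut off at this point. As a rough illustration of the layout described (a number at the start of a line, followed by either the end of the line or whitespace and the text), a sketch might look like the one below. It is not the poster's own logic. The name split_on_leading_numbers is hypothetical, and accepting a space as well as a tab after the number is an assumption taken from the sample lines.

```python
import re

# A line that is just a number, or a number followed by spaces/tabs and text.
# Allowing spaces as well as tabs is an assumption based on the sample lines.
LEADING_NUMBER = re.compile(r'^(\d+)(?:[ \t]+(.*))?$')


def split_on_leading_numbers(lines):
    """Return the text of each numbered item, with the number removed."""
    items = []
    for line in lines:
        line = line.rstrip('\r\n')
        m = LEADING_NUMBER.match(line)
        if m is not None:
            # New item; its text may be on this line or on the lines after it.
            items.append((m.group(2) or '').strip())
        elif items and line.strip():
            # Anything else is treated as a continuation of the current item.
            items[-1] = (items[-1] + ' ' + line.strip()).strip()
    return items
```

A continuation line that happens to start with a bare number would be mistaken for a new item here, which is the ambiguity the counter-based approach in this thread tries to avoid.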
Dave Hansen wrote:
Questions:
1) Do the citation numbers always begin in column 1?
Yes, that's one consistency at least. :)
2) Are the citation numbers always followed by a period and then at
least one whitespace character?
Yes, it seems to be either one or two whitespaces.
find the
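Given the two answers above (numbers always begin in column 1 and are followed by a period and one or two whitespace characters), the start of a citation can be recognized with a small regular expression. The sketch below is an illustration only: the function name bulletize and the choice of "*" as the bullet are assumptions, not something specified in the thread.

```python
import re

# Digits in column 1, a period, then one or two spaces or tabs.
CITATION_START = re.compile(r'^\d+\.[ \t]{1,2}')


def bulletize(line, bullet='*'):
    """Replace a leading citation number with a bullet, if there is one.

    Lines that do not start with a number in column 1 are returned unchanged.
    """
    # A function is used as the replacement so the bullet text is inserted
    # literally, with no backslash processing by the re module.
    return CITATION_START.sub(lambda m: bullet + ' ', line, count=1)
```

For example, bulletize("1. Levy, S.B. (1964)") gives "* Levy, S.B. (1964)", while a continuation line such as "bacteriophage T2. J. Bacteriol." is left alone because it does not start with digits.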
James Stroud wrote:
If you can count on the person not skipping any numbers in the
citations, you can take an AI approach to hopefully weed out the rare
circumstance that a number followed by a period starts a line in the
middle of the citation.
I don't think any numbers are skipped, but
John Salerno wrote:
So I need to remove the line breaks too, but of course not *all* of them
because each reference still needs a line break between it.
After doing a bit of search and replace for tabs with my text editor, I
think I've narrowed down the problem to just this:
I need to remove
John Salerno wrote:
John Salerno wrote:
So I need to remove the line breaks too, but of course not *all* of
them because each reference still needs a line break between it.
After doing a bit of search and replace for tabs with my text editor, I
think I've narrowed down the problem to
John Salerno wrote:
typed, there are often line breaks at the end of each line
Also, there are sometimes tabs used to indent the subsequent lines of
citation, but I assume with that I can just replace the tab with a space.
--
http://mail.python.org/mailman/listinfo/python-list
In [EMAIL PROTECTED], John Salerno wrote:
I have a large list of publication citations that are numbered. The
numbers are simply typed in with the rest of the text. What I want to do
is remove the numbers and then put bullets instead. Now, this alone
would be easy enough, with a little
Marc 'BlackJack' Rintsch wrote:
I think I have a vague idea what the input looks like, but it would be
helpful if you show some example input and wanted output.
Good idea. Here's what it looks like now:
1. Levy, S.B. (1964) Isologous interference with ultraviolet and X-ray
irradiated
On Tuesday 08 May 2007 22:23:31 John Salerno wrote:
John Salerno wrote:
typed, there are often line breaks at the end of each line
Also, there are sometimes tabs used to indent the subsequent lines of
citation, but I assume with that I can just replace the tab with a space.
Is this how the
On May 8, 3:00 pm, John Salerno [EMAIL PROTECTED] wrote:
Marc 'BlackJack' Rintsch wrote:
I think I have a vague idea what the input looks like, but it would be
helpful if you show some example input and wanted output.
Good idea. Here's what it looks like now:
1. Levy, S.B. (1964) Isologous
John Salerno wrote:
Marc 'BlackJack' Rintsch wrote:
Here's what it looks like now:
1. Levy, S.B. (1964) Isologous interference with ultraviolet and X-ray
irradiated
bacteriophage T2. J. Bacteriol. 87:1330-1338.
2. Levy, S.B. and T. Watanabe (1966) Mepacrine and transfer of R