Hi all,

I am new to Python and was wondering if I can get some help with my short 
script. What I would like the script to do is:
(1) Read the tab-delimited file generated by RefWorks
(2) Output exactly the same file, but with a blank column added in front.
(This is for prepping the tab-delimited file exported from RefWorks so that it 
can be imported into MySQL, so any suggestions along the lines of TIMTOWTDI 
would also be appreciated.)

This is what I have so far. It works, but in the output file I end up with a 
stray character at the start of the second column of each line (the first 
column of the original input file). I also don't really understand what 
escapechar does or what I am supposed to pass for it.

import csv
with open('noid_refworks.txt','rU') as csvinput:
    with open('withid.txt', 'w') as csvoutput:
        dialect = csv.Sniffer().sniff(csvinput.read(1024))
        csvinput.seek(0)
        reader = csv.reader(csvinput, dialect)
        writer = csv.writer(csvoutput, dialect, escapechar='\'',
                            quoting=csv.QUOTE_NONE)
        for row in reader:
            writer.writerow(['\t']+row)
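
For comparison, here is a minimal sketch of the intended result, assuming Python 3 and that the export really is tab-delimited. It names the delimiter explicitly instead of sniffing it, and prepends an empty string rather than '\t', so the writer itself emits the separating tab. The function name add_blank_first_column and the sample row are made up for illustration; this is one possible approach, not the only one.

```python
import csv
import io


def add_blank_first_column(src, dst):
    """Copy tab-delimited rows from src to dst with an empty field in front."""
    # QUOTE_NONE with quotechar=None treats quotation marks as ordinary text,
    # so fields containing commas or quotes pass through unchanged.
    reader = csv.reader(src, delimiter='\t',
                        quoting=csv.QUOTE_NONE, quotechar=None)
    writer = csv.writer(dst, delimiter='\t',
                        quoting=csv.QUOTE_NONE, quotechar=None,
                        lineterminator='\n')
    for row in reader:
        if not row:
            # Blank input line: skipped here, since a row holding only one
            # empty field cannot be written without quoting.
            continue
        writer.writerow([''] + row)


# Hypothetical one-row sample standing in for the RefWorks export.
sample = io.StringIO('Reference Type\tAuthors, Primary\tTitle "Primary"\n')
out = io.StringIO()
add_blank_first_column(sample, out)
print(repr(out.getvalue()))
# prints '\tReference Type\tAuthors, Primary\tTitle "Primary"\n'
```

With real files, both would be opened with newline='' as the csv module documentation recommends, and passed in place of the StringIO objects.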

A row in the original file looks like this (tab-delimited with no quoting; some 
fields contain commas and quotation marks):

Reference Type    Authors, Primary    Title Primary    Periodical Full    
Periodical Abbrev    Pub Year    Pub Date Free From    Volume    Issue    Start 
Page    Other Pages    Keywords    Abstract    Notes    Personal Notes    
Authors, Secondary    Title Secondary    Edition    Publisher    Place Of 
Publication    Authors, Tertiary    Authors, Quaternary    Authors, Quinary    
Title, Tertiary    ISSN/ISBN    Availability    Author/Address    Accession 
Number    Language    Classification    Sub file/Database    Original Foreign 
Title    Links    DOI    Call Number    Database    Data Source    Identifying 
Phrase    Retrieved Date    Shortened Title    User 1    User 2    User 3    
User 4    User 5    User 6    User 7    User 8    User 9    User 10    User 11  
  User 12    User 13    User 14    User 15

A row in the output file looks like this:
(The tab is successfully inserted, but I don't understand why an L is inserted 
after it, no matter what I pass as escapechar.)

    LReference Type    Authors, Primary    Title Primary    Periodical Full    
Periodical Abbrev    Pub Year    Pub Date Free From    Volume    Issue    Start 
Page    Other Pages    Keywords    Abstract    Notes    Personal Notes    
Authors, Secondary    Title Secondary    Edition    Publisher    Place Of 
Publication    Authors, Tertiary    Authors, Quaternary    Authors, Quinary    
Title, Tertiary    ISSN/ISBN    Availability    Author/Address    Accession 
Number    Language    Classification    Sub file/Database    Original Foreign 
Title    Links    DOI    Call Number    Database    Data Source    Identifying 
Phrase    Retrieved Date    Shortened Title    User 1    User 2    User 3    
User 4    User 5    User 6    User 7    User 8    User 9    User 10    User 11  
  User 12    User 13    User 14    User 15
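
One possible explanation, which cannot be confirmed without the actual file, is that csv.Sniffer guessed the letter L as the delimiter instead of the tab. The sketch below (Python 3 assumed) does not call the Sniffer; it hard-codes the delimiter to show that a wrong guess of 'L' would produce this symptom regardless of escapechar, and what escapechar does when the delimiter really is a tab. The helper name copy_with_leading_tab and the shortened header line are made up for illustration.

```python
import csv
import io


def copy_with_leading_tab(text, delimiter, escapechar):
    """Prepend a '\\t' field to every row, with an explicit delimiter."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    writer = csv.writer(out, delimiter=delimiter, escapechar=escapechar,
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        writer.writerow(['\t'] + row)
    return out.getvalue()


# Hypothetical shortened header line.
line = 'Reference Type\tAuthors\tLanguage\tLinks\n'

# Assumed wrong guess: rows are split on 'L' and rejoined with 'L', so the
# text looks unchanged apart from the new '\t' field and one extra 'L'.
guessed_wrong = copy_with_leading_tab(line, 'L', "'")
print(repr(guessed_wrong))
# prints '\tLReference Type\tAuthors\tLanguage\tLinks\r\n'

# Correct delimiter: the '\t' field now contains the delimiter itself, so
# with QUOTE_NONE the writer puts escapechar in front of it.
guessed_right = copy_with_leading_tab(line, '\t', "'")
print(repr(guessed_right))
# prints "'\t\tReference Type\tAuthors\tLanguage\tLinks\r\n"
```

In other words, escapechar is the character the writer places before a delimiter that occurs inside a field when quoting is turned off. Prepending an empty string instead of '\t' avoids the need for it.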


Any help or pointers would be greatly appreciated!
~Bohyun
