Re: Reading structured text file (non-CSV) into Pandas Dataframe
On Thursday, April 13, 2017 at 11:09:23 AM UTC+1, David Shi wrote:
> http://www.ebi.ac.uk/ena/data/warehouse/search?query=%22geo_circ(-0.587,-90.5713,170)%22=sequence_release=text
> The above is a web link to a structured text file. It is not a CSV.
> How can this text file be read into a Pandas Dataframe, so that further
> processing can be made?
> Looking forward to hearing from you.
> Regards.
> David

http://pandas.pydata.org/pandas-docs/stable/io.html

--
https://mail.python.org/mailman/listinfo/python-list
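A minimal sketch of one possible approach, under an assumption the thread does not confirm: that the link returns EMBL-style flat-file text, where each line starts with a two-letter code (ID, AC, DE, ...) and each record ends with a line containing only `//`. The function name `parse_flat_records` and the sample text are hypothetical. Lines without a code, such as indented sequence data, are skipped.

```python
import io

def parse_flat_records(lines):
    """Collect EMBL-style 'XX   value' lines into one dict per record.

    Assumption: a two-letter line code, then whitespace, then the value;
    a line consisting of '//' terminates the record.
    """
    records, current = [], {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "//":
            if current:
                records.append(current)
            current = {}
            continue
        code, _, value = line.partition(" ")
        value = value.strip()
        if not code or not value:
            continue  # skip blank, indented, and bare separator lines
        # Repeated codes (e.g. several DE lines) are joined with a space.
        current[code] = (current[code] + " " + value) if code in current else value
    if current:
        records.append(current)  # tolerate a missing final '//'
    return records

# Hypothetical sample, not real ENA output.
sample = io.StringIO(
    "ID   X00001; SV 1; linear\n"
    "DE   first line of description\n"
    "DE   second line\n"
    "//\n"
    "ID   X00002; SV 1; linear\n"
    "//\n"
)
records = parse_flat_records(sample)
print(records)
```

With pandas installed, `pandas.DataFrame(records)` turns the list of dicts into a DataFrame with one column per line code; codes missing from a record become NaN.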
Reading structured text file (non-CSV) into Pandas Dataframe
http://www.ebi.ac.uk/ena/data/warehouse/search?query=%22geo_circ(-0.587,-90.5713,170)%22=sequence_release=text

The above is a web link to a structured text file. It is not a CSV.

How can this text file be read into a Pandas Dataframe, so that further processing can be made?

Looking forward to hearing from you.

Regards.

David
Question about Reading from text file with Python's Array class
I've been trying to use the Array class to read 32-bit integers from a file. There is one integer per line and the integers are stored as text. For problem specific reasons, I only am allowed to read 2 lines (2 32-bit integers) at a time.

To test this, I made a small sample file (sillyNums.txt) as follows:

    109
    345
    2
    1234556

To read this file I created the following test script (trying to copy something I saw on Guido's blog - http://neopythonic.blogspot.com/2008/10/sorting-million-32-bit-integers-in-2mb.html ):

    import array
    assert array.array('i').itemsize == 4

    bufferSize = 2

    f=open("sillyNums.txt","r")
    data = array.array('i')
    data.fromstring(f.read(data.itemsize* bufferSize))
    print data

The output was nonsense:

    array('i', [171520049, 171258931])

I assume this has something to do with my incorrectly specifying how the various bit/bytes line up. Does anyone know if there's a simple explanation of how I can do this correctly, or why I can't do it at all?

Thanks,
Steven
Re: Question about Reading from text file with Python's Array class
traveller3141 wrote:

> I've been trying to use the Array class to read 32-bit integers from a
> file. There is one integer per line and the integers are stored as text.
> For problem specific reasons, I only am allowed to read 2 lines (2 32-bit
> integers) at a time.
>
> To test this, I made a small sample file (sillyNums.txt) as follows;
> 109
> 345
> 2
> 1234556

These are numbers in ASCII, not in binary. To read these you don't have to use the array class:

    from itertools import islice
    data = []
    with open("sillyNums.txt") as f:
        for line in islice(f, 2):
            data.append(int(line))

(If you insist on using an array you of course can.)

Binary denotes the numbers as they are stored in memory and typically used by low-level languages like C. A binary 32 bit integer always consists of 4 bytes. With your code

> To read this file I created the following test script (trying to copy
> something I saw on Guido's blog -
> http://neopythonic.blogspot.com/2008/10/sorting-million-32-bit-integers-in-2mb.html ):
>
> import array
> assert array.array('i').itemsize == 4
>
> bufferSize = 2
>
> f=open("sillyNums.txt","r")
> data = array.array('i')
> data.fromstring(f.read(data.itemsize* bufferSize))
> print data
>
> The output was nonsense:
> array('i', [171520049, 171258931])

you are telling Python to interpret the first 8 bytes in the file as two integers. The first 4 bytes are "1", "0", "9", "\n" (newline) when interpreted as characters or 49, 48, 57, 10 when looking at the bytes' numerical values. A 32 bit integer is calculated with

    >>> 49+48*2**8+57*2**16+10*2**24
    171520049

Looks familiar...
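The arithmetic above can be reproduced in Python 3 with the struct module. This is a sketch, not the original poster's code: the byte string is typed in by hand from the sample file, and little-endian order is spelled out explicitly (`array.array('i')` uses the machine's native order, which is little-endian on x86).

```python
import struct

raw = b"109\n345\n"  # the first 8 bytes of the sample file

# '<2i' = two little-endian signed 32-bit integers, i.e. the same
# reinterpretation of raw bytes that produced the "nonsense" output.
as_binary = struct.unpack("<2i", raw)
print(as_binary)

# What was actually wanted: parse each text line as a decimal number.
as_text = [int(line) for line in raw.decode("ascii").split()]
print(as_text)
```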
Re: Question about Reading from text file with Python's Array class
On 12/18/11 12:33, traveller3141 wrote:
> To test this, I made a small sample file (sillyNums.txt) as follows;
> 109
> 345
> 2
> 1234556
>
> f=open("sillyNums.txt","r")
> data = array.array('i')
> data.fromstring(f.read(data.itemsize* bufferSize))
> print data
>
> The output was nonsense:
> array('i', [171520049, 171258931])
>
> I assume this has something to do with my incorrectly specifying how the
> various bit/bytes line up. Does anyone know if there's a simple explanation
> of how I can do this correctly, or why I can't do it at all?

It reads the bytes directly as if using struct.unpack() as in

    >>> data = '109\n345\n2\n123456'
    >>> print ' '.join(hex(ord(c))[2:] for c in data)
    31 30 39 a 33 34 35 a 32 a 31 32 33 34 35 36
    >>> 0x0a393031 # 4 bytes
    171520049

It sounds like you want something like

    from itertools import islice
    a = array.array('i')
    a.fromlist(map(int, islice(f, bufferSize)))

which will read the first 2 lines of the file (using islice() on the file-object and slicing the first 2 items), and map the string representations into integers, then pass the resulting list of integer data to the array.array.fromlist() method.

-tkc
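A Python 3 sketch of the same idea, extended to walk the whole file two lines at a time. In Python 3, map() returns an iterator and fromlist() insists on a real list, so this version passes a generator to the array constructor instead. The helper name `read_chunks` is made up for illustration, and an in-memory io.StringIO stands in for the real file.

```python
import array
import io
from itertools import islice

def read_chunks(f, lines_per_chunk=2):
    """Yield array('i') objects holding up to lines_per_chunk integers each."""
    while True:
        # islice() pulls at most lines_per_chunk lines from the file object.
        chunk = array.array("i", (int(line) for line in islice(f, lines_per_chunk)))
        if not chunk:
            return  # an empty chunk means the file is exhausted
        yield chunk

f = io.StringIO("109\n345\n2\n1234556\n")  # stand-in for open("sillyNums.txt")
chunks = list(read_chunks(f))
print(chunks)
```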
reading a text file
hi clp

what's the difference between:

    while True:
        input_line = sys.stdin.readline()
        if input_line:
            sys.stdout.write(input_line.upper())
        else:
            break

and:

    while True:
        try:
            sys.stdout.write(sys.stdin.next().upper())
        except StopIteration:
            break

???
Re: reading a text file
superpollo u...@example.net writes:
> while True:
>     try:
>         sys.stdout.write(sys.stdin.next().upper())
>     except StopIteration:
>         break

Maybe there is some subtle difference, but it looks like you really mean

    for line in sys.stdin:
        sys.stdout.write(line.upper())
Re: reading a text file
superpollo wrote:

> hi clp
>
> what's the difference between:
>
> while True:
>     input_line = sys.stdin.readline()
>     if input_line:
>         sys.stdout.write(input_line.upper())
>     else:
>         break
>
> and:
>
> while True:
>     try:
>         sys.stdout.write(sys.stdin.next().upper())
>     except StopIteration:
>         break

You should write the latter as

    for line in sys.stdin:
        sys.stdout.write(line.upper())

or

    sys.stdout.writelines(line.upper() for line in sys.stdin)

You seem to know already that next() and readline() use different ways to signal "I'm done with the file". Also, after the first StopIteration subsequent next() calls are guaranteed to raise a StopIteration.

But the main difference is that file.next() uses an internal buffer, file.readline() doesn't. That means you would lose data if you tried to replace the readline() call below with next():

    first_line = f.readline()
    rest_of_the_file = f.read()

In newer Python versions you will get a ValueError when mixing next() and read()/readline(), but in older Pythons (before 2.5 I think) you are in for a surprise.

As

    for line in file:
        ...

is both the fastest and most readable approach if you want to access a file one line at a time, I recommend that you use it unless there is a specific reason not to.

Peter
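For readers on Python 3, where `sys.stdin.next()` is spelled `next(sys.stdin)`, the three loop styles discussed in this thread can be compared side by side. This is only a sketch: the function names are invented here, and io.StringIO objects stand in for stdin and stdout so the comparison runs without any input.

```python
import io

def upper_readline(src, dst):
    # readline() returns "" at end of file, so an empty string ends the loop.
    while True:
        line = src.readline()
        if not line:
            break
        dst.write(line.upper())

def upper_next(src, dst):
    # next() signals end of file by raising StopIteration instead.
    while True:
        try:
            dst.write(next(src).upper())
        except StopIteration:
            break

def upper_for(src, dst):
    # The for loop calls next() and handles StopIteration itself.
    for line in src:
        dst.write(line.upper())

text = "first\nsecond\nlast line without newline"
results = []
for func in (upper_readline, upper_next, upper_for):
    out = io.StringIO()
    func(io.StringIO(text), out)
    results.append(out.getvalue())

# All three produce the same output for the same input.
print(results[0] == results[1] == results[2])
```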
Re: reading a text file
superpollo wrote:

> hi clp
>
> what's the difference between:
>
> while True:
>     input_line = sys.stdin.readline()
>     if input_line:
>         sys.stdout.write(input_line.upper())
>     else:
>         break
>
> and:
>
> while True:
>     try:
>         sys.stdout.write(sys.stdin.next().upper())
>     except StopIteration:
>         break

More useless code; under the hood it works similarly. But why not use it the way it was intended?

    for input_line in sys.stdin:
        sys.stdout.write(input_line.upper())

?

Regards
Tino
Reading from text file
I want to read from a text file, 25 lines each time I press the Enter key, just like the Python documentation.

I'm using Python 2.5 on Windows XP.

Thank you
Re: Reading from text file
A. Joseph wrote:

> I want to read from text file, 25 lines each time i press enter key,
> just like the python documentation.

you can use pydoc's pager from your program:

    import pydoc
    text = open(filename).read()
    pydoc.pager(text)

/F
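If a hand-rolled loop is preferred over pydoc's pager, the "25 lines per Enter" behaviour can be sketched as below. This is an illustration in Python 3 spelling (on Python 2.5 the prompt call would be raw_input() and print would be a statement); the names `pages` and `show` are made up here.

```python
from itertools import islice

def pages(lines, size=25):
    """Yield successive lists of at most `size` lines from an iterable."""
    it = iter(lines)
    while True:
        page = list(islice(it, size))
        if not page:
            return
        yield page

def show(path, size=25):
    """Print a file one page at a time, waiting for Enter between pages."""
    with open(path) as f:
        for page in pages(f, size):
            print("".join(page), end="")
            input("-- press Enter for more --")

# Demonstrate the chunking on 60 generated lines instead of a real file.
demo = list(pages(["line %d\n" % i for i in range(60)]))
print([len(p) for p in demo])
```

Calling `show("somefile.txt")` on a real file would then print 25 lines, wait for Enter, and repeat until the file ends.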
Re: reading hebrew text file
Really, thanks.

hagai
reading hebrew text file
I have a Hebrew text file, which I want to read in Python.

I don't know which encoding I need to use.

How do I do that?

thanks,
hagai
Re: reading hebrew text file
[EMAIL PROTECTED] wrote:

> I have a hebrew text file, which I want to read in python
> I don't know which encoding I need to use
>
> how I do that

As for the how, look to the codecs module -- but if you don't know what codec the textfile is written in, I know of no ways to guess from here!-)

Alex
Re: reading hebrew text file
I looked for VAV in the files in the encodings directory (/usr/lib/python2.4/encodings/*.py on my machine). I found that the following character encodings seem to include Hebrew characters:

    cp1255 cp424 cp856 cp862 iso8859-8

A file containing Hebrew text might be in any one of these encodings, or any unicode-based encoding.

To open an encoded file for reading, use

    f = codecs.open(file, 'r', encoding='...')

Now, calls like 'f.readline()' will return unicode strings.

Here's an example, using a file in UTF-8 I have lying around:

    >>> f = codecs.open("/users/jepler/txt/UTF-8-demo.txt", "r", "utf-8")
    >>> for i in range(5): print repr(f.readline())
    ...
    u'UTF-8 encoded sample plain-text file\n'
    u'\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\n'
    u'\n'
    u'Markus Kuhn [\u02c8ma\u02b3k\u028as ku\u02d0n] [EMAIL PROTECTED] \u2014 1999-08-20\n'
    u'\n'

Jeff
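In Python 3 the built-in open() accepts the encoding directly, so codecs.open() is no longer needed for this. A minimal round-trip sketch, assuming for the sake of the example that the file is in cp1255 (the thread never establishes which encoding the real file uses); the temporary file exists only to make the example self-contained.

```python
import os
import tempfile

sample = "\u05e9\u05dc\u05d5\u05dd\n"  # the Hebrew word "shalom" plus a newline

# Write the sample as cp1255 bytes so there is something to read back.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample.encode("cp1255"))
    path = tmp.name

# open() decodes on the fly; readline() returns an ordinary str.
with open(path, "r", encoding="cp1255") as f:
    line = f.readline()
os.remove(path)

print(repr(line))
```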
Re: reading hebrew text file
[EMAIL PROTECTED] wrote:

> I have a hebrew text file, which I want to read in python
> I don't know which encoding I need to use

that's not a good start. but maybe it's one of these:

http://sites.huji.ac.il/tex/hebtex_fontsrep.html

?

> how I do that

    f = open(myfile)
    text = f.readline()

followed by one of

    text = text.decode("iso-8859-8")
    text = text.decode("cp1255")
    text = text.decode("cp862")

alternatively, use:

    f = codecs.open(myfile, "r", encoding)

to get a stream that decodes things on the fly.

/F
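When the encoding is unknown, one rough heuristic is to try each candidate named in this thread and keep the one that yields the most Hebrew letters. This is a sketch in Python 3, not a reliable detector: the function names are invented, the candidate list is just the encodings mentioned above plus UTF-8, and it cannot tell apart cp1255 and iso8859-8, which place the Hebrew letters at the same byte values (ties go to the earlier candidate).

```python
CANDIDATES = ["utf-8", "cp1255", "iso8859-8", "cp862", "cp856", "cp424"]

def count_hebrew(text):
    # Hebrew letters alef..tav occupy U+05D0..U+05EA.
    return sum("\u05d0" <= ch <= "\u05ea" for ch in text)

def guess_hebrew_encoding(raw, candidates=CANDIDATES):
    """Return (encoding, hebrew_letter_count) for the best-scoring candidate."""
    best = (None, 0)
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        score = count_hebrew(text)
        if score > best[1]:
            best = (name, score)
    return best

word = "\u05e9\u05dc\u05d5\u05dd"  # "shalom"
print(guess_hebrew_encoding(word.encode("utf-8")))
print(guess_hebrew_encoding(word.encode("cp862")))
```

On a real file, `raw` would come from `open(myfile, "rb").read()`; the winning name can then be passed as the `encoding` argument when reopening the file in text mode.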