Re: Read lines from very large text file

2009-02-08 Thread Michael Ash
On Sun, Feb 8, 2009 at 2:44 AM, Joar Wingfors j...@joar.com wrote: On Feb 7, 2009, at 7:13 PM, Michael Ash wrote: What's wrong is that they won't allow you to specify the text encoding to use. The same thing is true for the *deprecated* method +stringWithCString: by the way. That is

Re: Read lines from very large text file

2009-02-07 Thread Michael Ash
On Sat, Feb 7, 2009 at 12:39 AM, Clark Cox clarkc...@gmail.com wrote: Even if you delete the file from the filesystem, you are just deleting the mapping from that particular filename to the file's actual data. The actual file is still there until the last process with an open handle closes it, so
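
The point about deleting a file that is still open can be illustrated with a minimal sketch, assuming a POSIX system. The helper name `read_after_unlink` is hypothetical and not something from the thread; it opens a file, removes its name, and then reads through the descriptor that is still open.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens `path`, removes its directory entry, then
   reads up to `cap` bytes through the descriptor that is still open.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_after_unlink(const char *path, char *out, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Only the name is removed here; the data stays reachable via fd. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, out, cap);
    close(fd); /* last reference gone; the system may now reclaim the data */
    return n;
}
```

After the call the name no longer exists in the directory, yet the read still returned the file's contents.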

Re: Read lines from very large text file

2009-02-07 Thread René v Amerongen
Thus the warning: if a file disappears while you have it memory mapped, and you try to access it, you will crash. Does this mean that we should check every time the existence of the file before we try to read anything from the memory mapped file? RvA

Re: Read lines from very large text file

2009-02-07 Thread Michael Ash
On Sat, Feb 7, 2009 at 7:57 AM, René v Amerongen apple...@xs4all.nl wrote: Thus the warning: if a file disappears while you have it memory mapped, and you try to access it, you will crash. Does this mean that we should check every time the existence of the file before we try to read

Re: Read lines from very large text file

2009-02-07 Thread Steve Sisak
At 1:38 PM +1100 2/3/09, Jacob Rhoden wrote: On 3/02/2009 8:41 AM, Kenneth Bruno II wrote: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. For very large files you probably want to use NSFileHandle. With the method

Re: Read lines from very large text file

2009-02-07 Thread Steve Sisak
At 9:46 AM -0800 2/7/09, Joar Wingfors wrote: On Feb 7, 2009, at 6:55 AM, Steve Sisak wrote: Umm, unless I'm totally missing something, what's wrong with fopen() and fgets(), possibly followed with [NSString stringWithCString] on each line? What's wrong is that they won't allow you to
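
For reference, the fopen()/fgets() approach being discussed might look roughly like the sketch below. The function name `count_lines` and the 4096-byte buffer are assumptions made for illustration. The code works on raw bytes only, so the encoding question raised in the replies still applies as soon as a line is turned into a string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in the file at `path`, reading
   one buffer-sized piece at a time so memory use stays constant.
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char buf[4096];
    long lines = 0;
    int in_line = 0; /* nonzero while the current line's '\n' has not been seen */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;      /* a complete line (or the tail of a long one) */
            in_line = 0;
        } else {
            in_line = 1;  /* line longer than buf, or last line without '\n' */
        }
    }
    if (in_line)
        lines++;          /* count a final line that lacks a trailing '\n' */

    fclose(fp);
    return lines;
}
```

A line longer than the buffer arrives in several pieces, which is why the counter only advances when a piece ends in a newline.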

Re: Read lines from very large text file

2009-02-07 Thread Clark Cox
On Sat, Feb 7, 2009 at 10:27 AM, Steve Sisak sgs-li...@codewell.com wrote: At 9:46 AM -0800 2/7/09, Joar Wingfors wrote: On Feb 7, 2009, at 6:55 AM, Steve Sisak wrote: Umm, unless I'm totally missing something, what's wrong with fopen() and fgets(), possibly followed with [NSString

Re: Read lines from very large text file

2009-02-07 Thread Michael Ash
On Sat, Feb 7, 2009 at 12:46 PM, Joar Wingfors j...@joar.com wrote: On Feb 7, 2009, at 6:55 AM, Steve Sisak wrote: Umm, unless I'm totally missing something, what's wrong with fopen() and fgets(), possibly followed with [NSString stringWithCString] on each line? What's wrong is that they

Re: Read lines from very large text file

2009-02-07 Thread Joar Wingfors
On Feb 7, 2009, at 7:13 PM, Michael Ash wrote: What's wrong is that they won't allow you to specify the text encoding to use. The same thing is true for the *deprecated* method +stringWithCString: by the way. That is incorrect. I don't think that what I said is incorrect, at least not

Re: Read lines from very large text file

2009-02-06 Thread Matt Neuburg
On Tue, 03 Feb 2009 00:42:07 +1100, Jacob Rhoden li...@jacobrhoden.com said: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. Would there be a way to do this with dataWithContentsOfMappedFile? I've long wondered about

Re: Read lines from very large text file

2009-02-06 Thread Michael Ash
On Fri, Feb 6, 2009 at 6:44 PM, Matt Neuburg m...@tidbits.com wrote: On Tue, 03 Feb 2009 00:42:07 +1100, Jacob Rhoden li...@jacobrhoden.com said: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. Would there be a way to

Re: Read lines from very large text file

2009-02-06 Thread Sean McBride
Michael Ash (michael@gmail.com) on 2009-02-06 9:20 PM said: Would there be a way to do this with dataWithContentsOfMappedFile? I've long wondered about that... m. Yes and no. +dataWithContentsOfMappedFile: can be used to do this kind of efficient parsing, as memory mapping of files means

Re: Read lines from very large text file

2009-02-06 Thread Joar Wingfors
On Feb 6, 2009, at 8:52 PM, Sean McBride wrote: How can you guarantee a file's existence? sudo rm -f? How about calling open() on it? j o a r

Re: Read lines from very large text file

2009-02-06 Thread Clark Cox
On Fri, Feb 6, 2009 at 9:30 PM, Sean McBride cwat...@cam.org wrote: Joar Wingfors (j...@joar.com) on 2009-02-06 12:06 AM said: How can you guarantee a file's existence? sudo rm -f? How about calling open() on it? :) But note the latter part of the sentence: this method should only be used

Re: Read lines from very large text file

2009-02-06 Thread Chris Ridd
On 7 Feb 2009, at 05:39, Clark Cox wrote: On Fri, Feb 6, 2009 at 9:30 PM, Sean McBride cwat...@cam.org wrote: Joar Wingfors (j...@joar.com) on 2009-02-06 12:06 AM said: How can you guarantee a file's existence? sudo rm -f? How about calling open() on it? :) But note the latter part of

Re: Read lines from very large text file

2009-02-03 Thread Jacob Rhoden
On 3/2/09 4:55 PM, Michael Ash wrote: Everything I've seen in this thread so far skimps on one important detail: If you're just looking at the raw data, how do you know how to interpret it? It hasn't been addressed because it's not really relevant to the question at hand. Yes, you

Re: Read lines from very large text file

2009-02-03 Thread Alexander Spohr
Am 03.02.2009 um 10:46 schrieb Jacob Rhoden: On 3/2/09 4:55 PM, Michael Ash wrote: It is not uncommon that I might have to deal with server logs that go into the gigabytes. Most logs (apache, squid, etc...) are all ascii encoded. The line ending is irrelevant, see a \n or a \r and we

Re: Read lines from very large text file

2009-02-03 Thread glenn andreas
On Feb 2, 2009, at 11:25 PM, Seth Willits wrote: On Feb 2, 2009, at 7:50 PM, Joar Wingfors wrote: Before opening the file, either determine, guess, or be told what the encoding is. With that encoding, convert your delimiter string into raw bytes, then do byte-for-byte comparison on the

Re: Read lines from very large text file

2009-02-03 Thread Joar Wingfors
On Feb 2, 2009, at 9:55 PM, Michael Ash wrote: It hasn't been addressed because it's not really relevant to the question at hand. Yes, you definitely need to either know or be able to discover the text encoding of the text files you're dealing with. But aside from both being about text files,

Re: Read lines from very large text file

2009-02-03 Thread Scott Ribe
Would a correct implementation not depend on being able to iterate over characters, and not simply using a fixed step size? Not in order to find line endings. Now, actually doing anything with the line of text is a different issue, dependent on the encoding. -- Scott Ribe

Re: Read lines from very large text file

2009-02-03 Thread Scott Ribe
Might it help to look at the source for 'more' and/or 'less' (the Unix utilities)? No idea whether they handle non-native line breaks competently. (Many many tools do not.) -- Scott Ribe

Re: Read lines from very large text file

2009-02-02 Thread Jacob Rhoden
Yea, I saw this and some posts on the apple forum saying NSInputStream is not the right way, hence the question, what is the right way to analyze a very large file line by line. cf the apple thread http://discussions.apple.com/thread.jspa?threadID=1187120&tstart=400 On 3/2/09 12:57 AM,

Re: Read lines from very large text file

2009-02-02 Thread Robert Martin
Sorry - the link should have been: http://ridiculousfish.com/hexfiend/ On Feb 2, 2009, at 9:51 PM, Jacob Rhoden wrote: Yea, I saw this and some posts on the apple forum saying NSInputStream is not the right way, hence the question, what is the right way to analyze a very large file line by

Re: Read lines from very large text file

2009-02-02 Thread Kenneth Bruno II
On Feb 2, 2009, at 8:42 AM, Jacob Rhoden wrote: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. I know there are helper functions like stringWithContentsOfFile:encoding:error:, but this implies having to load

Re: Read lines from very large text file

2009-02-02 Thread Michael Ash
On Mon, Feb 2, 2009 at 3:51 PM, Jacob Rhoden li...@jacobrhoden.com wrote: Yea, I saw this and some posts on the apple forum saying NSInputStream is not the right way, hence the question, what is the right way to analyze a very large file line by line. cf the apple thread

Re: Read lines from very large text file

2009-02-02 Thread Joar Wingfors
On Feb 2, 2009, at 1:49 PM, Seth Willits wrote: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. Use mmap. Scan through the bytes to find line ranges, and create strings from there. Make sure it's deallocated when
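
The "use mmap and scan for line ranges" suggestion could be sketched as follows, assuming a POSIX system and a file whose lines end in a single '\n' byte. The names `for_each_line_mmap`, `line_fn` and `sum_line_lengths` are made up for this illustration. Other line endings and multi-byte encodings would need a different search step.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical callback type: receives one line's byte range (no '\n'). */
typedef void (*line_fn)(const char *start, size_t len, void *ctx);

/* Example callback: adds each line's length to the size_t ctx points at. */
void sum_line_lengths(const char *start, size_t len, void *ctx)
{
    (void)start;
    *(size_t *)ctx += len;
}

/* Maps `path` read-only and calls `fn` (if not NULL) once per line.
   Returns the number of lines, or -1 on error. */
long for_each_line_mmap(const char *path, line_fn fn, void *ctx)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close */
    if (base == MAP_FAILED)
        return -1;

    long lines = 0;
    const char *p = base;
    const char *end = base + size;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) : (size_t)(end - p);
        if (fn)
            fn(p, len, ctx);
        lines++;
        p += len + (nl ? 1 : 0); /* skip past the newline if there was one */
    }

    munmap(base, size);     /* release the mapping when done */
    return lines;
}
```

Each callback receives a pointer into the mapping, so the range is only valid until munmap is called.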

Re: Read lines from very large text file

2009-02-02 Thread Kenneth Bruno II
On Feb 2, 2009, at 9:29 PM, Peter Duniho wrote: On Feb 2, 2009, at 6:02 PM, Seth Willits wrote: Before opening the file, either determine, guess, or be told what the encoding is. With that encoding, convert your delimiter string into raw bytes, then do byte-for-byte comparison on the file

Re: Read lines from very large text file

2009-02-02 Thread Peter Duniho
On Feb 2, 2009, at 6:56 PM, Kenneth Bruno II wrote: On Feb 2, 2009, at 9:29 PM, Peter Duniho wrote: Is there not a Cocoa class that handles character encoding and line- based reading from files, streams, etc.? And an equivalent one for writing? That seems like an odd omission for a

Re: Read lines from very large text file

2009-02-02 Thread Joar Wingfors
On Feb 2, 2009, at 6:02 PM, Seth Willits wrote: Before opening the file, either determine, guess, or be told what the encoding is. With that encoding, convert your delimiter string into raw bytes, then do byte-for-byte comparison on the file to find occurrences of that delimiter. How
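
The byte-for-byte delimiter comparison described in the quoted text might be sketched like this. The function name `find_delimiter` and the `step` parameter are assumptions added for illustration: `step` is the code-unit width of the encoding (1 for ASCII or UTF-8, 2 for UTF-16), so that a match is only accepted on a code-unit boundary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the byte offset of the first occurrence of
   `delim` in `data` that starts at a multiple of `step`, or -1 if none. */
long find_delimiter(const unsigned char *data, size_t len,
                    const unsigned char *delim, size_t dlen, size_t step)
{
    if (dlen == 0 || step == 0 || dlen > len)
        return -1;

    for (size_t i = 0; i + dlen <= len; i += step) {
        if (memcmp(data + i, delim, dlen) == 0)
            return (long)i;
    }
    return -1;
}
```

With UTF-16LE data, the newline delimiter is the two bytes 0x0A 0x00. Searching with a step of 1 can report a match that straddles two characters, which a step of 2 avoids.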

Re: Read lines from very large text file

2009-02-02 Thread Peter Duniho
On Feb 2, 2009, at 7:50 PM, Joar Wingfors wrote: How do you know what delimiter string to use? Another thing that you'd have to determine, guess or be told, right? In general I would guess that in this case it would almost always be impossible and / or inappropriate to attempt to determine

Re: Read lines from very large text file

2009-02-02 Thread Greg Parker
On Feb 2, 2009, at 7:50 PM, Joar Wingfors wrote: On Feb 2, 2009, at 6:02 PM, Seth Willits wrote: Before opening the file, either determine, guess, or be told what the encoding is. With that encoding, convert your delimiter string into raw bytes, then do byte-for-byte comparison on the file

Re: Read lines from very large text file

2009-02-02 Thread Michael Ash
On Mon, Feb 2, 2009 at 8:53 PM, Joar Wingfors j...@joar.com wrote: On Feb 2, 2009, at 1:49 PM, Seth Willits wrote: I am wondering what the best way to read a text file, line by line, when the file size is much larger than available memory. Use mmap. Scan through the bytes to find line