In Office 2003 for Windows, it eventually opens
the file fine, only curses beforehand.
SYLK appears to be a very important format for
exchange between spreadsheet(-gnostic) software,
like RTF for documents and WMF for pictures.
Its clipboard format constant, CF_SYLK, is number 4, just after CF_METAFILEPICT.
Oleg Kobchenko wrote:
We need general-purpose read-line functionality.
It is common in the C runtime and in other languages.
Although it is possible to do in J, it's better not
to redo the low-level stuff every time.
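For illustration, such a general-purpose read-line loop might be sketched in Python (the name read_lines and the block size are illustrative choices, not from any library):

```python
def read_lines(path, block=1 << 20):
    """Yield complete lines from a file, reading fixed-size blocks.
    Only the partial tail of each block is carried forward."""
    tail = b''
    with open(path, 'rb') as f:
        while True:
            data = f.read(block)
            if not data:
                break
            tail += data
            lines = tail.split(b'\n')
            tail = lines.pop()          # last piece may be a partial line
            for line in lines:
                yield line
    if tail:                            # file not LF-terminated
        yield tail
```

The generator streams, so memory use is bounded by the block size plus the longest line.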
I suggest that we add two new definitions to the files script. One is
I have tried it on a 1.2GB file. Since my laptop has only 1GB RAM, I
killed the process when it consumed 500MB (and rising).
Yoel
On 5/15/06, Henry Rich [EMAIL PROTECTED] wrote:
Try
x ([: I. E.) y
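Here x ([: I. E.) y gives the starting indices of x within y: E. produces a boolean mask of match starts (including overlapping ones) and I. converts the mask to indices. A rough Python equivalent, with find_starts a made-up name, might be:

```python
def find_starts(needle, haystack):
    """Starting indices of needle in haystack, overlapping matches
    included -- roughly J's  x ([: I. E.) y."""
    out, i = [], haystack.find(needle)
    while i != -1:
        out.append(i)
        i = haystack.find(needle, i + 1)   # i+1, so overlaps are found
    return out
```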
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can be a
spurious error if the file is slightly larger
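The guard in the fragment above can be sketched in Python; complete_lines and its at_eof flag are illustrative names, not the actual library code:

```python
def complete_lines(block: bytes, at_eof: bool) -> bytes:
    """Pad the final block to complete lines; signal if no LF is present."""
    if b'\n' not in block:
        raise ValueError('file not in LF-delimited lines')
    if at_eof and not block.endswith(b'\n'):
        block += b'\n'      # tolerate a file whose last line lacks its LF
    return block
```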
At 09:38 -0400 2006/05/16, Miller, Raul D wrote:
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can
It is all relative.
The LF can be seen (as you do) as end-of-line, or as a new line.
In the first case, all lines should end with an end-of-line.
In the second, LF separates one line from another.
When editing a text file, and requesting to place the cursor at end of
file, with no LF at the end the
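The two readings are easy to demonstrate with Python string operations, for instance:

```python
text = "one\ntwo\n"
# LF as a separator: a terminated file appears to have a final empty line
assert text.split("\n") == ["one", "two", ""]
# LF as a terminator: the trailing LF closes the last line
assert text.splitlines() == ["one", "two"]
# without a final LF the two readings agree on the lines,
# but only split reveals the missing terminator
assert "one\ntwo".split("\n") == ["one", "two"]
assert "one\ntwo".splitlines() == ["one", "two"]
```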
Certainly, in my experience, LF, CR, or CRLF are considered
as EOL (in *IX, Mac, and PC OSs). Going way back, these things
came from input devices such as the IBM 1050 which was an
early typewriter terminal. It had the charming attribute
that the return key did just that (returned the carriage
as on
These are interesting stories about line terminators.
I agree on providing all the data.
But I think the absence of a final terminator is more
a stylistic issue (or a matter of choice) than a defect.
Hence, it is more like truthfully conveying the data
than enforcing cleanliness.
Here it is on Cygwin:
[EMAIL PROTECTED]
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
A case where you can choose not to use it includes things
like trying to use Excel to open a text file that starts
with the ASCII
Our company uses OpenOffice exclusively. It is a mature
replacement for MS Office.
Joey K Tuttle wrote:
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
...
At 15:29 -0400 2006/05/16, Miller, Raul D wrote:
Joey K Tuttle wrote:
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
And sometimes those files are broken or virus infected,
Miller, Raul D wrote:
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can be a
spurious error if the
Oleg Kobchenko wrote:
It's a great idea to include line reading
in a standard library. Here are a few comments.
There are two differences from the original
readlines:
- overlapped reading (not once and only once)
(with asserting presence of LF in current block)
- automatic removal
Chris Burke wrote:
I am in two minds on the buffer. It does impact performance, though not
by much. But it means that after the block of 1e6 bytes is read in, it
is immediately copied because it is appended to the tail of the previous
block. So the question is whether this performance hit is
I am not sure about overlapped either. Raul's idea about
special-casing sounds good, as does the discussion on
spreading the copy cost. In my test, the impact was 5-7%
or so -- a good price for streaming.
I think the bottleneck is the looping in u;.2
and the line processing itself.
I ran the UNIX wc, and it
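A wc -l-style count needs no line splitting at all, which sidesteps the buffer-append copy entirely; a minimal Python sketch, with count_lines a made-up name:

```python
def count_lines(path, block=1 << 20):
    """Count LF-delimited lines block by block, wc -l style.
    No line boundaries are tracked, so no tail is carried over."""
    n = 0
    with open(path, 'rb') as f:
        while True:
            b = f.read(block)
            if not b:
                return n
            n += b.count(b'\n')
```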
At 20:54 -0700 2006/05/16, Oleg Kobchenko wrote:
http://support.microsoft.com/kb/215591/
ID,NAME
666,MS
Don' B H8N
Yes - I knew the workaround and even puzzled out that
the origin of the bug is that SYLK files begin with
ID;. You would think that some bright programmer could
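The misdetection is easy to sidestep when generating such files. A hedged sketch: defuse_sylk is a made-up name, and lowercasing the leading ID is one commonly cited workaround, not necessarily the one in the KB article:

```python
def defuse_sylk(text: str) -> str:
    """Excel treats any text file whose first two characters are 'ID'
    as SYLK (KB215591); lowercasing the header avoids the check."""
    if text.startswith('ID'):
        return 'id' + text[2:]
    return text
```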
To: Programming forum
Subject: Re: [Jprogramming] Scanning a large file
It won't work for large files. E. returns a 'limit error'.
Yoel
On 5/14/06, Joey K Tuttle [EMAIL PROTECTED] wrote:
Yoel,
Some of the feedback you got suggested mapped files, others
suggested just reading the file. My own habits lean towards
reading the file and I have a utility verb that
At 12:14 -0300 2006/05/15, Randy MacDonald wrote:
limit is only 2GB
A phrase I thought I'd _never_ hear
---
indeed ... and presumably not applicable on 64-bit systems...
Some of the clients of the company I work for deal with files up to
a terabyte long, usually in physics, life sciences, simulation, etc.
The new file system in Solaris (ZFS) is a 128-bit FS.
Anyway, data mining from log files is an important use of a language for me.
I am very pleased
The answers to your questions are, as you yourself point out, in memory-mapped files.
Read the labs and experiment with them.
2006/5/14, Yoel Jacobsen [EMAIL PROTECTED]:
Hello,
I'm new to J so please forgive me if this is a FAQ.
I wrote some short sentences to parse a log file. I want to retrieve all
the
I probably was not clear.
My question is not how to use mapped files, but where to go from there.
Mapped files do not solve the problem directly, since I can't use the same
algorithm on them.
For instance, cutopen would take tremendous time and space. Moreover, since
the length of the lines is
You may not have understood what mapped files are.
You do not read them into the workarea.
Opening a mapped file takes a very short time.
The cutopen you mention is probably reading all the data into the workarea,
and that is not the way mapped files will help your case.
As you see in the mapped file labs
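A memory-mapped scan can be sketched in Python with mmap: the regex runs directly over the map and the OS pages bytes in on demand, so the file is never read into the workspace wholesale. The function name and the csn pattern are illustrative:

```python
import mmap
import re

def scan_mapped(path, pattern=rb'(?<= csn )\w+'):
    """Regex-scan a file through a memory map; re accepts the mmap
    object as a bytes-like buffer, so no copy of the file is made."""
    with open(path, 'rb') as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return set(re.findall(pattern, m))
```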
Yoel Jacobsen wrote:
I wrote some short sentences to parse a log file. I want to retrieve all
the
unique values of some attribute. The way it appears in the log file is
attribute name, a space, then attribute value, such as: csn 92892849893284
...
My initial (brute force) program is:
text =:
If the file is really large, I prefer regex instead.
--
For information about J forums see http://www.jsoftware.com/forums.htm
Even for regex, I don't see how to avoid manually reading the file in chunks
which is too imperative style for me. Again, consider the Python example:
for line in file.readlines():
    match_object = re.search(r'(?<= csn )\w+', line)
    if match_object:
        process(match_object.group(0))
The regex can be
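For the unique-values goal, iterating the file object (rather than calling readlines(), which materializes every line) keeps memory bounded; a sketch, with unique_csn a made-up name:

```python
import re

def unique_csn(path):
    """Stream the file line by line and collect unique csn values;
    memory use is bounded by the number of distinct values."""
    pat = re.compile(r'(?<= csn )\w+')
    seen = set()
    with open(path) as f:
        for line in f:          # iterating a file object streams it
            m = pat.search(line)
            if m:
                seen.add(m.group(0))
    return seen
```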
Chris has shown how to do it in a way specific
to a concrete example. It is suggested to separate