In Office 2003 for Windows, it eventually opens
the file fine, only curses beforehand.
SYLK appears to be a very important format for
exchange between spreadsheet(-gnostic) software,
like RTF for documents and WMF for pictures.
Its clipboard format constant, CF_SYLK, is number 4, just after CF_METAFILEPICT.
Oleg Kobchenko wrote:
We need general-purpose read-line functionality.
It is common in the C runtime and in other languages.
Although it is possible to do in J, it's better not
to redo the low-level stuff every time.
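For illustration, such a general-purpose read-line loop might be sketched in Python (the name read_lines and the block size are illustrative choices, not from any library):

```python
def read_lines(path, block=1 << 20):
    """Yield complete lines from a file, reading fixed-size blocks.
    Only the partial tail of each block is carried forward."""
    tail = b''
    with open(path, 'rb') as f:
        while True:
            data = f.read(block)
            if not data:
                break
            tail += data
            lines = tail.split(b'\n')
            tail = lines.pop()          # last piece may be a partial line
            for line in lines:
                yield line
    if tail:                            # file not LF-terminated
        yield tail
```

The generator streams, so memory use is bounded by the block size plus the longest line.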
I suggest that we add two new definitions to the files script. One is
I have tried it on a 1.2GB file. Since my laptop has only 1GB RAM, I
killed the process when it consumed 500MB (and rising).
Yoel
On 5/15/06, Henry Rich [EMAIL PROTECTED] wrote:
Try
x ([: I. E.) y
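Here x ([: I. E.) y gives the starting indices of x within y: E. produces a boolean mask of match starts (including overlapping ones) and I. converts the mask to indices. A rough Python equivalent, with find_starts a made-up name, might be:

```python
def find_starts(needle, haystack):
    """Starting indices of needle in haystack, overlapping matches
    included -- roughly J's  x ([: I. E.) y."""
    out, i = [], haystack.find(needle)
    while i != -1:
        out.append(i)
        i = haystack.find(needle, i + 1)   # i+1, so overlaps are found
    return out
```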
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can be a
spurious error if the file is slightly larger
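The guard in the fragment above can be sketched in Python; complete_lines and its at_eof flag are illustrative names, not the actual library code:

```python
def complete_lines(block: bytes, at_eof: bool) -> bytes:
    """Pad the final block to complete lines; signal if no LF is present."""
    if b'\n' not in block:
        raise ValueError('file not in LF-delimited lines')
    if at_eof and not block.endswith(b'\n'):
        block += b'\n'      # tolerate a file whose last line lacks its LF
    return block
```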
At 09:38 -0400 2006/05/16, Miller, Raul D wrote:
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can
It is all relative.
The LF can be seen (as you do) as end-of-line, or as a new line.
In the first case, all lines should end with an end-of-line.
In the second, LF separates one line from another.
When editing a text file, and requesting to place the cursor at end of
file, with no LF at the end the
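The two readings are easy to demonstrate with Python string operations, for instance:

```python
text = "one\ntwo\n"
# LF as a separator: a terminated file appears to have a final empty line
assert text.split("\n") == ["one", "two", ""]
# LF as a terminator: the trailing LF closes the last line
assert text.splitlines() == ["one", "two"]
# without a final LF the two readings agree on the lines,
# but only split reveals the missing terminator
assert "one\ntwo".split("\n") == ["one", "two"]
assert "one\ntwo".splitlines() == ["one", "two"]
```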
Certainly, in my experience, LF, CR, or CRLF are considered
as EOL (in *IX, Mac, and PC OSs). Going way back, these things
came from input devices such as the IBM 1050 which was an
early typewriter terminal. It had the charming attribute
that the return key did just that (returned the carriage
as on
These are interesting stories about line terminators.
I agree on providing all the data.
But I think the absence of a final terminator is more
a stylistic issue (or a matter of choice) than a defect.
Hence, it is more like truthfully conveying the data
than enforcing cleanliness.
Here it is on Cygwin:
[EMAIL PROTECTED]
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
A case where you can choose not to use it includes things
like trying to use Excel to open a text file that starts
with the ASCII
Our company uses OpenOffice exclusively. It is a mature
replacement for MS Office.
Joey K Tuttle wrote:
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
...
At 15:29 -0400 2006/05/16, Miller, Raul D wrote:
Joey K Tuttle wrote:
OK, MS (not bashing women :) Excel - the problem is,
one often doesn't have the choice not to use it in
the sense that people send files exported from Excel...
And sometimes those files are broken or virus infected,
Miller, Raul D wrote:
Chris Burke wrote:
if. len #dat do.
if. p s do.
dat=. dat, LF
else.
'file not in LF-delimited lines' 13!:8[3
Note that this assumes that the last line of the file is
terminated by a line feed. Otherwise, there can be a
spurious error if the
Oleg Kobchenko wrote:
It's a great idea to include line reading
in a standard library. Here are a few comments.
There are two differences from the original
readlines:
- overlapped reading (not once and only once)
(with asserting presence of LF in current block)
- automatic removal
Chris Burke wrote:
I am in two minds on the buffer. It does impact performance, though not
by much. But it means that after the block of 1e6 bytes is read in, it
is immediately copied because it is appended to the tail of the previous
block. So the question is whether this performance hit is
I am not sure about overlapped either. Raul's idea about
special-casing sounds good, as does the discussion on
spreading the copy cost. In my test, the impact was 5-7%
or so -- a good price for streaming.
I think the bottleneck is the looping in u;.2
and the line processing itself.
I ran the UNIX wc, and it
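A wc -l-style count needs no line splitting at all, which sidesteps the buffer-append copy entirely; a minimal Python sketch, with count_lines a made-up name:

```python
def count_lines(path, block=1 << 20):
    """Count LF-delimited lines block by block, wc -l style.
    No line boundaries are tracked, so no tail is carried over."""
    n = 0
    with open(path, 'rb') as f:
        while True:
            b = f.read(block)
            if not b:
                return n
            n += b.count(b'\n')
```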
At 20:54 -0700 2006/05/16, Oleg Kobchenko wrote:
http://support.microsoft.com/kb/215591/
ID,NAME
666,MS
Don' B H8N
Yes - I knew the workaround and even puzzled out that
the origin of the bug is that SYLK files begin with
ID;. You would think that some bright programmer could
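The misdetection is easy to sidestep when generating such files. A hedged sketch: defuse_sylk is a made-up name, and lowercasing the leading ID is one commonly cited workaround, not necessarily the one in the KB article:

```python
def defuse_sylk(text: str) -> str:
    """Excel treats any text file whose first two characters are 'ID'
    as SYLK (KB215591); lowercasing the header avoids the check."""
    if text.startswith('ID'):
        return 'id' + text[2:]
    return text
```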
To: Programming forum
Subject: Re: [Jprogramming] Scanning a large file
It won't work for large files. E. returns a 'limit error'.
Yoel
On 5/14/06, Joey K Tuttle [EMAIL PROTECTED] wrote:
Yoel,
Some of the feedback you got suggested mapped files, others
suggested just reading the file. My own habits lean towards
reading the file and I have a utility verb that
At 12:14 -0300 2006/05/15, Randy MacDonald wrote:
limit is only 2GB
A phrase I thought I'd _never_ hear
---
indeed ... and presumably not applicable on 64-bit systems...
Some of the clients of the company I work for deal with files up to
a terabyte long, usually in physics, life sciences, simulation, etc.
The new file system in Solaris (ZFS) is a 128-bit FS.
Anyway, data mining from log files is an important use of a language for me.
I am very pleased
The answers to your questions are, as you yourself point out, in memory-mapped files.
Read the labs and experiment with them.
2006/5/14, Yoel Jacobsen [EMAIL PROTECTED]:
Hello,
I'm new to J so please forgive me if this is a FAQ.
I wrote some short sentences to parse a log file. I want to retrieve all
the
I probably was not clear.
My question is not how to use mapped files, but where to go from there.
Mapped files do not solve the problem directly, since I can't use the same
algorithm on them.
For instance, cutopen would take tremendous time and space. Moreover, since
the length of the lines is
You may not have understood what mapped files are.
You do not read them into the workarea.
Opening a mapped file takes a very short time.
The cutopen you mention is probably reading all the data into the workarea,
and that is not the way mapped files will help your case.
As you see in the mapped file labs
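A memory-mapped scan can be sketched in Python with mmap: the regex runs directly over the map and the OS pages bytes in on demand, so the file is never read into the workspace wholesale. The function name and the csn pattern are illustrative:

```python
import mmap
import re

def scan_mapped(path, pattern=rb'(?<= csn )\w+'):
    """Regex-scan a file through a memory map; re accepts the mmap
    object as a bytes-like buffer, so no copy of the file is made."""
    with open(path, 'rb') as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return set(re.findall(pattern, m))
```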
Yoel Jacobsen wrote:
I wrote some short sentences to parse a log file. I want to retrieve all
the
unique values of some attribute. The way it appears in the log file is
attribute name, a space, then attribute value, such as: csn 92892849893284
...
My initial (brute force) program is:
text =:
If the file is really large, I prefer regex instead.
--
For information about J forums see http://www.jsoftware.com/forums.htm
Even for regex, I don't see how to avoid manually reading the file in chunks
which is too imperative style for me. Again, consider the Python example:
for line in file.readlines():
    match_object = re.search(r'(?<= csn )\w+', line)
    if match_object:
        process(match_object.group(0))
The regex can be
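For the unique-values goal, iterating the file object (rather than calling readlines(), which materializes every line) keeps memory bounded; a sketch, with unique_csn a made-up name:

```python
import re

def unique_csn(path):
    """Stream the file line by line and collect unique csn values;
    memory use is bounded by the number of distinct values."""
    pat = re.compile(r'(?<= csn )\w+')
    seen = set()
    with open(path) as f:
        for line in f:          # iterating a file object streams it
            m = pat.search(line)
            if m:
                seen.add(m.group(0))
    return seen
```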
Chris has shown how to do it in a way specific
to a concrete example. It is suggested to separate