Hi all,

Maybe I am missing something myself, but I too find Julia's text-file 
reading facilities inadequate (or maybe I am using them wrong).

As Jacob Quinn suggested, I tried using readdlm, but it only seems to work 
for text files with a regular structure.  For example:
#Comments at start
#Comments at start
...
#Comments at start

#Regular table of data:
data1, data2, data3, ... dataN
data1, data2, data3, ... dataN
data1, data2, data3, ... dataN
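
For reference, a minimal sketch of reading such a regular file with readdlm. 
It assumes a current Julia release, where readdlm lives in the DelimitedFiles 
standard library and comment skipping must be requested explicitly; the 
sample numbers are made up for illustration:

```julia
using DelimitedFiles

# Made-up sample in the regular layout shown above: comment lines, then a table.
text = """
#Comments at start
#Comments at start
#Regular table of data:
1,2,3
4,5,6
"""

# comments=true asks readdlm to ignore everything from comment_char onward.
table = readdlm(IOBuffer(text), ','; comments=true, comment_char='#')

println(size(table))
```

Every row must have the same layout for this to give a useful matrix, which 
is exactly the limitation discussed here.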

..But a lot of existing file formats are *not* this regular.

..So I have developed a temporary workaround (still waiting for someone to 
show me what I am doing wrong).

*Solution*
My solution is not as flexible as the scanf() tool, but it works well with 
the 90% of file formats that don't need that level of control.  It is 
provided as part of the FileIO2 module:
https://github.com/ma-laforge/FileIO2.jl

The solution is based on a new type called "TextReader", constructed with:
reader = TextReader(::IO, splitter= [' ', '\t'])
(As with readdlm, you can add ',' to splitter for CSV files)

The goal of the TextReader object is to take control of an ::IO stream, and 
provide higher-level read functionality.

To use it, you can open a text file using the "Base.open" method:
reader = open(FileIO2.TextReader, "MYTEXTFILE.TXT")

You can then read a regular file with the following pattern:
while !eof(reader)
    data = read(reader, Int)
    println("Read: $(typeof(data))(`$data`)")
end

And you can read more irregular data with the following:
dataid = read(reader, AbstractString)
dataval = read(reader, Int)
dataid = read(reader, AbstractString)
dataval = read(reader, Int)
...

Note that the current method for `read(stream::IO, DataType)` considers the 
data stream to be binary... not text.  However, by dispatching on a 
TextReader object, we can re-define the behaviour to be more appropriate 
for the task at hand.
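
To illustrate the dispatch idea, here is a rough sketch of how a wrapper type 
can give `read` a text-oriented meaning.  `TokenReader`, `isdelim`, 
`skipdelims` and `nexttoken` are hypothetical names made up for this example; 
this is not the actual FileIO2 code.  It assumes Julia 1.x and a seekable 
stream (skipchars needs to step back one character):

```julia
# Hypothetical stand-in for a TextReader-like type (NOT the FileIO2 implementation).
struct TokenReader
    io::IO
    splitter::Vector{Char}
end
TokenReader(io::IO; splitter=[' ', '\t']) = TokenReader(io, collect(Char, splitter))

# Line breaks always separate tokens, in addition to the splitter characters.
isdelim(r::TokenReader, c::Char) = c in r.splitter || c == '\n' || c == '\r'

# Advance the stream past any run of delimiters.
skipdelims(r::TokenReader) = skipchars(c -> isdelim(r, c), r.io)

# The reader is exhausted once only delimiters remain.
Base.eof(r::TokenReader) = (skipdelims(r); eof(r.io))

# Collect characters up to (and consuming) the next delimiter.
function nexttoken(r::TokenReader)
    skipdelims(r)
    tok = Char[]
    while !eof(r.io)
        c = read(r.io, Char)
        isdelim(r, c) && break
        push!(tok, c)
    end
    return join(tok)
end

# Dispatching on TokenReader (not IO) lets read() treat the data as text.
Base.read(r::TokenReader, ::Type{T}) where {T<:Number} = parse(T, nexttoken(r))
Base.read(r::TokenReader, ::Type{T}) where {T<:AbstractString} = nexttoken(r)

# Usage on made-up irregular data: alternating identifiers and values.
reader = TokenReader(IOBuffer("width 10\nheight\t20\n"))
id1 = read(reader, AbstractString)
val1 = read(reader, Int)
id2 = read(reader, AbstractString)
val2 = read(reader, Int)
```

Because `TokenReader` is not a subtype of IO in this sketch, the new `read` 
methods do not collide with the binary ones defined in Base.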

You can also let parse() auto-detect the type, for quick-and-dirty 
solutions:
dataany = read(reader, Any) #Might return an Int, String, Float, ...
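
One possible way to get this kind of auto-detection is sketched below.  It 
uses tryparse rather than parse(), and the `autodetect` helper is a 
hypothetical name for illustration only, so the rules may differ from what 
FileIO2 actually does (assumes Julia 1.x, where tryparse returns `nothing` on 
failure):

```julia
# Hypothetical helper: try the narrowest numeric type first, fall back to String.
function autodetect(token::AbstractString)
    for T in (Int, Float64)
        value = tryparse(T, token)
        value !== nothing && return value
    end
    return String(token)
end

for token in ("42", "3.5", "abc")
    value = autodetect(token)
    println(token, " => ", typeof(value))
end
```

The return type depends on the file contents, which is fine for 
quick-and-dirty scripts but makes the calling code harder to reason about.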

Feel free to use it... but I suggest making a copy of the code.  The 
FileIO2 interface is still in the experimental phase. It *will* 
change/break your code.

*FileIO2 Comment*
The solution in the FileIO2.jl module looks a bit complicated because it 
ties into a File{}-type-based-system.  The different File{} types are used 
by Julia's dispatch system to call the appropriate open/read/write method 
(unique Julia signature).  This is not possible when using simple Strings 
to pass the filename.
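
As a rough illustration of why a File{} type helps, the sketch below encodes 
the format in a type parameter so that dispatch can pick a method.  The 
`File` definition and the `reader_for` function here are simplified, 
hypothetical stand-ins, not the FileIO2 definitions:

```julia
# Hypothetical, simplified File{} type: the format is part of the type itself.
struct File{T}
    path::String
end
File(fmt::Symbol, path::AbstractString) = File{fmt}(String(path))

# One method per format; Julia selects the method from the type parameter.
reader_for(::File{:text}) = "text reader"
reader_for(::File{:csv}) = "csv reader"
reader_for(::File{T}) where {T} = "no reader registered for :$T"

println(reader_for(File(:text, "MYTEXTFILE.TXT")))
println(reader_for(File(:png, "image.png")))
```

A plain String filename carries no such type information, so every file 
would hit the same method.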

FYI: With the higher-level API, the call is simply:
reader = open(File(:text, "MYTEXTFILE.TXT"))

This call uses Julia introspection to detect text readers from any currently 
loaded module.

Note that only the "FileIO2.TextReader" object performs this service at the 
moment...


Regards,

MA


On Friday, March 7, 2014 at 4:20:30 PM UTC-5, Jacob Quinn wrote:
>
> Have you checked out `readdlm`? Its flexibility has grown over time and 
> serves most of my needs.
>
> http://docs.julialang.org/en/latest/stdlib/base/#Base.readdlm
>
> -Jacob
>
