Hi Milan,
No worries. I greatly appreciate that you have taken the time to help, and
I will also do as you advised.
Thanks.
Keith
On Saturday, 7 February 2015 08:17:51 UTC-8, Milan Bouchet-Valat wrote:
>
> On Friday, 6 February 2015 at 15:01 -0800, Keith Kee wrote:
> > Hi Milan,
> >
> >
> > Thanks for your advice.
> >
> >
> > I spotted one corrupted line in a smaller sample of 3000 lines, and
> > after that it worked.
> >
> >
> > Then I tried a larger sample of 10000 lines and it gave the following:
> >     Saw 10000 rows, 4 columns and 40022 fields
> >      * Line 1 has 6 columns
> > The 4 columns are correct. I am not sure where "Line 1" starts, but the
> > first line was fine when I used only the 3000-line file.
> >
> >
> > How do I find the corruptions using the above message? Clearly it
> > detected 6 columns in some "Line 1", but it is not the first line.
> >
> >
> > Are there any Julia functions or packages I can use to clean up the
> > data, or that will highlight corrupted lines in the data?
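One way to highlight suspect lines without any package is to count the comma-separated fields on each line and report the lines whose count differs from the expected 4. The following is only a rough sketch under some assumptions: it is written for current Julia (1.x) and has not been checked against 0.3.5, the name `find_bad_lines` is made up for illustration, and it splits naively on commas, so quoted fields that contain commas would be miscounted. The sample data is invented.

```julia
# Hypothetical helper (not part of DataFrames): returns (line number, field count)
# for every line whose field count differs from `expected`.
# Assumption: fields never contain the delimiter or embedded newlines.
function find_bad_lines(io::IO, expected::Integer; delim::Char=',')
    bad = Tuple{Int,Int}[]
    for (lineno, line) in enumerate(eachline(io))
        n = length(split(line, delim))
        if n != expected
            push!(bad, (lineno, n))
        end
    end
    return bad
end

# Invented sample: the second line has 6 fields instead of 4.
sample = "a,b,c,d\n1,2,3,4,5,6\nw,x,y,z\n"
for (lineno, n) in find_bad_lines(IOBuffer(sample), 4)
    println("line $lineno has $n fields")
end
```

For a real file, something like `open(io -> find_bad_lines(io, 4), "EURUSD.CSV")` should apply the same check, again assuming the file is plain comma-separated text.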
> >
> >
> > I did try loading the 15,000-line CSV file into Excel and it worked
> > fine there.
> >
> >
> > Looking forward to your expert advice.
> Sorry, I'm not really an expert on that function. Can't you identify the
> problematic line by continuing to split the file into halves?
>
> Anyway, you should file a bug against the DataFrames package on GitHub.
> People there will be more knowledgeable, and there is apparently a bug at
> least in the line number that is being reported.
>
>
> Regards
>
> > Thanks.
> >
> >
> > Keith
> >
> > On Friday, 6 February 2015 12:19:55 UTC-8, Milan Bouchet-Valat wrote:
> > On Friday, 6 February 2015 at 11:12 -0800, Keith Kee wrote:
> > > Hi
> > >
> > >
> > > Using DataFrames (v"0.6.0") and Win32 Julia 0.3.5
> > >
> > >
> > > ds = readtable("EURUSD.CSV", header=false)
> > >
> > >
> > >
> > > results in
> > >
> > >
> > >
> > > BoundsError()
> > > in findcorruption at io.jl:698
> > > in readtable! at io.jl:779
> > > in readtable at io.jl:893
> > >
> > >
> > > The original file has 15000 lines; it works when I cut it down to
> > > 10 lines.
> > >
> > >
> > > Please advise as to whether there are limits to readtable on win32
> > > setups?
> > 15000 sounds quite small even for 32-bit. More likely, the file contains
> > something readtable() doesn't like, and which does not appear in the
> > first 10 lines. You could try removing half of the file, see if it
> > works, and go on like that until you (possibly) find out which line
> > creates a bug.
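The halving approach described above could be automated as a binary search over prefixes of the file. This is a minimal sketch and not anything DataFrames provides: the names `first_bad_line` and `parses_ok` are hypothetical, the code targets current Julia (1.x) and is untested on 0.3.5, and it assumes that a prefix of the file fails to parse as soon as it contains a bad line. The check used in the demo is a stand-in; in practice `parses_ok` might write the lines to a temporary file and wrap the `readtable` call in `try`/`catch`.

```julia
# Hypothetical helper: index of the first line that makes parsing fail,
# or `nothing` if the whole input parses.
# Assumption: the empty prefix parses, and failure is monotonic in prefix length.
function first_bad_line(lines::AbstractVector, parses_ok)
    parses_ok(lines) && return nothing
    lo, hi = 0, length(lines)   # prefix of length lo parses, prefix of length hi fails
    while hi - lo > 1
        mid = div(lo + hi, 2)
        if parses_ok(lines[1:mid])
            lo = mid
        else
            hi = mid
        end
    end
    return hi
end

# Stand-in check for the demo: every line must have exactly 4 comma-separated fields.
four_fields(ls) = all(l -> length(split(l, ',')) == 4, ls)

# Invented sample: the third line has 6 fields.
demo = ["1,2,3,4", "5,6,7,8", "1,2,3,4,5,6", "9,10,11,12"]
println(first_bad_line(demo, four_fields))
```

Each round halves the search range, so a 15000-line file needs roughly 14 parse attempts. If several lines are bad, the sketch reports only the first one, so it would need to be rerun after each fix.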
> >
> >
> > Regards
>
>
>
>