Re: [julia-users] Read a structured binary file with big endian unsigned integers (4 bytes) and big endian floats (4 bytes)

2016-09-18 Thread Femto Trader
I think it's ok now...
https://github.com/femtotrader/DataReaders.jl/issues/14

but I'm still blocked by the LZMA compression;
see https://github.com/yuyichao/LibArchive.jl/issues/2
and my question at https://groups.google.com/forum/#!topic/julia-users/G9Pqe5svS3c

Any help would be appreciated.
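
For comparison with the Python version quoted below, here is a minimal sketch of the decompression step done in memory with Python's standard `lzma` module. It assumes the .bi5 payload is a legacy LZMA ("alone") stream, which is what the `xz --format=lzma` command in this thread suggests; the helper name `decompress_bi5` is hypothetical, and the data is synthetic rather than a real Dukascopy file.

```python
import lzma


def decompress_bi5(raw: bytes) -> bytes:
    """Decompress the bytes of a .bi5 file held in memory.

    Assumption: the payload uses the legacy .lzma container
    (FORMAT_ALONE), matching `xz --decompress --format=lzma`.
    """
    return lzma.decompress(raw, format=lzma.FORMAT_ALONE)


# Round trip on synthetic data: compress with the same legacy format,
# then check that the helper restores the original bytes.
payload = bytes(range(20)) * 3
compressed = lzma.compress(payload, format=lzma.FORMAT_ALONE)
restored = decompress_bi5(compressed)
```

With a real file, `raw` would come from `open("20h_ticks.bi5", "rb").read()`, so no intermediate `cp` and `xz` step is needed.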

On Sunday, September 18, 2016 at 15:00:03 UTC+2, Tim Holy wrote:
>
> See ntoh and hton. Perhaps even better, see StrPack.jl. 
>
> --Tim 
>

Re: [julia-users] Read a structured binary file with big endian unsigned integers (4 bytes) and big endian floats (4 bytes)

2016-09-18 Thread Tim Holy
See ntoh and hton. Perhaps even better, see StrPack.jl.
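
To illustrate what a network-to-host conversion such as `ntoh` does (network byte order is big endian), here is a small sketch using Python's `struct` module. It only demonstrates the byte-order idea on four hand-picked bytes; it is not the Julia API itself.

```python
import struct

# Four bytes as they might appear in the file (most significant byte first).
raw = bytes([0x00, 0x00, 0x01, 0x02])

# Interpreting the bytes as big endian gives the intended value.
big = struct.unpack(">I", raw)[0]

# Interpreting the same bytes as little endian (the native order on x86)
# gives a different, wrong value.
little = struct.unpack("<I", raw)[0]

# Reversing the byte order of the misread value recovers the intended one;
# this is the effect of a network-to-host swap on a little-endian machine.
swapped = int.from_bytes(little.to_bytes(4, "little"), "big")

# The same applies to 4-byte floats: pack and unpack with the ">" prefix.
value = struct.unpack(">f", struct.pack(">f", 1.5))[0]
```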

--Tim

On Sunday, September 18, 2016 1:13:43 AM CDT Femto Trader wrote:
> Hello,
> 
> I'd like to read this file
> http://www.dukascopy.com/datafeed/EURUSD/2016/02/14/20h_ticks.bi5
> using Julia.
> 
> It's an LZMA-compressed file.
> 
> I can decompress it using
> cp 20h_ticks.bi5 20h_ticks.xz
> xz --decompress --format=lzma 20h_ticks.xz
> 
> Now, I have a 20h_ticks binary file.
> 
> It's a structured binary file containing an array of records,
> each with these big-endian fields:
>   Date       unsigned integer  4 bytes
>   Ask        unsigned integer  4 bytes
>   Bid        unsigned integer  4 bytes
>   AskVolume  float             4 bytes
>   BidVolume  float             4 bytes
> 
> 
> Using Python, I'm able to read it and get a pandas DataFrame:
> 
> import numpy as np
> import pandas as pd
> import datetime
> symb = "EURUSD"
> dt_chunk = datetime.datetime(2016, 2, 14)
> # One record: three big-endian uint32 fields, then two big-endian float32 fields
> record_dtype = np.dtype([
>     ('Date', '>u4'),
>     ('Ask', '>u4'),
>     ('Bid', '>u4'),
>     ('AskVolume', '>f4'),
>     ('BidVolume', '>f4'),
> ])
> 
> data = np.fromfile("20h_ticks", dtype=record_dtype)
> columns = ["Date", "Ask", "Bid", "AskVolume", "BidVolume"]
> df = pd.DataFrame(data, columns=columns)
> # Prices are stored as integers; JPY pairs use 3 decimal digits, others 5
> if symb[3:] == "JPY":
>     p_digits = 3
> else:
>     p_digits = 5
> for p in ["Ask", "Bid"]:
>     df[p] = df[p] / 10**p_digits
> # Date is a millisecond offset from the start of the chunk
> df["Date"] = dt_chunk + pd.to_timedelta(df["Date"], unit="ms")
> df = df.set_index("Date")
> 
> I'd like to do the same using Julia.
> 
> Here is what I have so far:
> 
> symb = "EURUSD"
> day_chunk = Date(2016, 2, 14)
> h_chunk = 20
> dt_chunk = DateTime(day_chunk) + Base.Dates.Hour(h_chunk)
> filename = @sprintf "%02dh_ticks" h_chunk
> println(filename)
> 
> immutable TickRecordType
>   Date::UInt32
>   Ask::UInt32
>   Bid::UInt32
>   AskVolume::Float32
>   BidVolume::Float32
> end
> 
> f = open(filename)
> 
> # ...
> 
> close(f)
> 
> but I'm stuck now.
> 
> Any help would be appreciated.
> 
> Kind regards
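
The record layout described in the question can also be decoded without NumPy. The following is a minimal sketch using only Python's standard `struct` module, under the assumptions stated in the thread: 20-byte records of three big-endian unsigned 32-bit integers followed by two big-endian 32-bit floats, with `Date` being a millisecond offset from the start of the chunk. The names `RECORD` and `parse_ticks` are hypothetical, and the sample values are made up for illustration.

```python
import struct
from datetime import datetime, timedelta

# ">" selects big endian; "I" is uint32, "f" is float32 (20 bytes in total).
RECORD = struct.Struct(">IIIff")


def parse_ticks(data: bytes, chunk_start: datetime, price_digits: int = 5):
    """Return a list of (time, ask, bid, ask_volume, bid_volume) tuples.

    Raises struct.error if len(data) is not a multiple of the record size.
    """
    scale = 10 ** price_digits
    ticks = []
    for ms, ask, bid, ask_vol, bid_vol in RECORD.iter_unpack(data):
        when = chunk_start + timedelta(milliseconds=ms)
        ticks.append((when, ask / scale, bid / scale, ask_vol, bid_vol))
    return ticks


# Synthetic two-record buffer (made-up values, not real market data).
sample = (RECORD.pack(250, 112550, 112540, 1.5, 0.75)
          + RECORD.pack(1000, 112560, 112545, 2.0, 1.0))
ticks = parse_ticks(sample, datetime(2016, 2, 14, 20))
```

For JPY pairs, `price_digits=3` would follow the same scaling rule as the pandas code in the question.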