Re: [julia-users] Read a structured binary file with big endian unsigned integers (4 bytes) and big endian floats (4 bytes)
I think it's OK now; see https://github.com/femtotrader/DataReaders.jl/issues/14

However, I'm still blocked by the LZMA compression; see https://github.com/yuyichao/LibArchive.jl/issues/2 and my question at https://groups.google.com/forum/#!topic/julia-users/G9Pqe5svS3c

Any help would be appreciated.

On Sunday, 18 September 2016 at 15:00:03 UTC+2, Tim Holy wrote:
> See ntoh and hton. Perhaps even better, see StrPack.jl.
>
> --Tim
>
> On Sunday, September 18, 2016 1:13:43 AM CDT Femto Trader wrote:
> > Hello,
> >
> > I'd like to read this file
> > http://www.dukascopy.com/datafeed/EURUSD/2016/02/14/20h_ticks.bi5
> > using Julia.
> > [...]
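
For the LZMA step mentioned above, here is a minimal Python sketch (Python because the thread already contains a working Python reader). It only illustrates the container format involved: `xz --format=lzma` handles the legacy .lzma layout, which Python's standard `lzma` module calls `FORMAT_ALONE`. The payload is synthetic, and the sketch assumes the .bi5 file uses that same container; it says nothing about which Julia binding supports it.

```python
import lzma

# Synthetic payload standing in for decompressed tick bytes (not real data).
payload = bytes(range(20)) * 3

# FORMAT_ALONE is the legacy .lzma container, the one `xz --format=lzma` reads.
compressed = lzma.compress(payload, format=lzma.FORMAT_ALONE)

# Decompress in memory. For a real file, read its bytes first, e.g.
# open("20h_ticks.bi5", "rb").read(), assuming the .bi5 file uses this container.
restored = lzma.decompress(compressed, format=lzma.FORMAT_ALONE)

print(len(restored))
```

Doing the decompression in memory avoids the intermediate `cp` and `xz` steps, at the cost of holding the whole hourly chunk in RAM.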
Re: [julia-users] Read a structured binary file with big endian unsigned integers (4 bytes) and big endian floats (4 bytes)
See ntoh and hton. Perhaps even better, see StrPack.jl.

--Tim

On Sunday, September 18, 2016 1:13:43 AM CDT Femto Trader wrote:
> Hello,
>
> I'd like to read this file
> http://www.dukascopy.com/datafeed/EURUSD/2016/02/14/20h_ticks.bi5
> using Julia.
> [...]
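
To make the byte-order point concrete, here is a small Python sketch of what a network-to-host conversion does for one 32-bit unsigned integer. `ntoh_u32` is a hypothetical helper written for illustration only; it is not the Julia function, just the same idea: swap bytes when the host is little-endian, leave them alone otherwise.

```python
import struct
import sys

# The value 1000 stored big-endian (network order), as it would sit in the file.
raw = bytes([0x00, 0x00, 0x03, 0xE8])

as_big = struct.unpack(">I", raw)[0]     # interpret the bytes as big-endian
as_little = struct.unpack("<I", raw)[0]  # same bytes misread as little-endian


def ntoh_u32(value):
    """Hypothetical helper: convert a 32-bit value from network to host order."""
    if sys.byteorder == "little":
        # Reverse the four bytes on little-endian hosts.
        return int.from_bytes(value.to_bytes(4, "little"), "big")
    return value


# Read in host byte order first, then convert, which is the ntoh pattern.
host_value = struct.unpack("=I", raw)[0]
converted = ntoh_u32(host_value)

print(as_big, as_little, converted)
```

The same pattern applies to the 4-byte floats: read the raw bytes, fix the byte order, then reinterpret the bits as a float.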
[julia-users] Read a structured binary file with big endian unsigned integers (4 bytes) and big endian floats (4 bytes)
Hello,

I'd like to read this file
http://www.dukascopy.com/datafeed/EURUSD/2016/02/14/20h_ticks.bi5
using Julia.

It's an LZMA-compressed file.

I can decompress it using

    cp 20h_ticks.bi5 20h_ticks.xz
    xz --decompress --format=lzma 20h_ticks.xz

Now I have a 20h_ticks binary file.

It's a structured binary file containing an array of records, each with these big-endian fields:

    Date       unsigned integer  4 bytes
    Ask        unsigned integer  4 bytes
    Bid        unsigned integer  4 bytes
    AskVolume  float             4 bytes
    BidVolume  float             4 bytes

Using Python, I'm able to read it and get a Pandas DataFrame:

    import numpy as np
    import pandas as pd
    import datetime
    symb = "EURUSD"
    dt_chunk = datetime.datetime(2016, 2, 14)
    record_dtype = np.dtype([
        ('Date', '>u4'),
        ('Ask', '>u4'),
        ('Bid', '>u4'),
        ('AskVolume', '>f4'),
        ('BidVolume', '>f4'),
    ])

    data = np.fromfile("20h_ticks", dtype=record_dtype)
    columns = ["Date", "Ask", "Bid", "AskVolume", "BidVolume"]
    df = pd.DataFrame(data, columns=columns)
    if symb[3:] == "JPY":
        p_digits = 3
    else:
        p_digits = 5
    for p in ["Ask", "Bid"]:
        df[p] = df[p] / 10**p_digits
    df["Date"] = dt_chunk + pd.to_timedelta(df["Date"], unit="ms")
    df = df.set_index("Date")

I'd like to do the same using Julia.

So far I have:

    symb = "EURUSD"
    day_chunk = Date(2016, 2, 14)
    h_chunk = 20
    dt_chunk = DateTime(day_chunk) + Base.Dates.Hour(h_chunk)
    filename = @sprintf "%02dh_ticks" h_chunk
    println(filename)

    immutable TickRecordType
        Date::UInt32
        Ask::UInt32
        Bid::UInt32
        AskVolume::Float32
        BidVolume::Float32
    end

    f = open(filename)

    # ...

    close(f)

but I'm stuck now.

Any help would be appreciated.

Kind regards
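
The record layout above can also be spelled out without NumPy, which may make the port easier because every field and its byte order is explicit. This is only a sketch using Python's standard `struct` module: `parse_ticks` and `RECORD` are hypothetical names, the sample buffer is made up rather than real Dukascopy data, and the chunk start of 20:00 follows the Julia snippet (day plus hour), which is an assumption about how the `Date` offsets should be anchored.

```python
import struct
from datetime import datetime, timedelta

# One record: three big-endian uint32 followed by two big-endian float32 (20 bytes).
RECORD = struct.Struct(">IIIff")


def parse_ticks(raw, chunk_start, price_digits=5):
    """Decode raw bytes into (time, ask, bid, ask_volume, bid_volume) tuples."""
    ticks = []
    for ms, ask, bid, ask_vol, bid_vol in RECORD.iter_unpack(raw):
        ticks.append((
            chunk_start + timedelta(milliseconds=ms),  # offset in ms from chunk start
            ask / 10**price_digits,                    # integer price to decimal
            bid / 10**price_digits,
            ask_vol,
            bid_vol,
        ))
    return ticks


# Synthetic two-record buffer (illustrative values only).
sample = (RECORD.pack(1000, 112345, 112340, 1.5, 2.25)
          + RECORD.pack(2500, 112350, 112344, 0.75, 3.0))

ticks = parse_ticks(sample, datetime(2016, 2, 14, 20))
for tick in ticks:
    print(tick)
```

In Julia the loop body would be the corresponding five reads per record, each followed by a byte-order conversion, repeated until the end of the stream.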