Other people at my firm who know a lot about binary files couldn't figure
out the parts of the file that I am skipping over. Part of the issue is
that I have to process several different files like this (.dbs extension),
and their structure changes depending on the source of the files.

In short, the problem is over my head, and I was hoping to jump straight to
the correct bit and read from there, which would make things much easier. I
guess not...
Thanks for your help though.

Anyone else?

thanks,

ben

On Tue, Jun 19, 2012 at 10:10 AM, jim holtman <jholt...@gmail.com> wrote:

> I am not sure why reading through 'bit-by-bit' gets you to where you
> want to be.  I assume that the file has some structure, even though it
> may be changing daily.  You mentioned the various types of data that
> it might contain; are they all in 'byte' sized chunks?  If you really
> have data that begins in the middle of a byte and then extends over
> several bytes, you will have to write some functions that will pull
> out this data and then reconstruct it into an object (e.g., integer,
> numeric, ...) that R understands.  Can you provide some more
> definition of what the data actually looks like and how you would find
> the "pattern" of the data?  Almost all systems read byte sized chunks
> at the lowest level, and if you really have to get down to the bit
> level to reconstruct the data, then you have to write the unpack/pack
> functions.  This can all be done once you understand the structure of
> the data.  So some examples would be useful if you want someone to
> propose a solution.
>
> On Tue, Jun 19, 2012 at 11:54 AM, Ben quant <ccqu...@gmail.com> wrote:
> > Hello,
> >
> > Has a function been built that will skip to a certain bit in a binary
> > file?
> >
> > As of 2009 the answer was 'no':
> > http://r.789695.n4.nabble.com/read-binary-file-seek-td900847.html
> > https://stat.ethz.ch/pipermail/r-help/2009-May/199819.html
> >
> > If you feel I don't need to (like in the links above), please provide some
> > help. (Note this is my first time working with binary files.)
> >
> > I'm still working on the script, but here is where I am right now. The for
> > loop is being used because:
> >
> > 1) I have to get down to the correct position and then get the info I
> > want/need. The stuff I am reading through (x) is not fully understood; it
> > is a mix of chars, floats, integers, etc. of various sizes, so I don't
> > know how many bytes to read in unless I read them bit by bit. (The
> > information and structure of the information changes daily so I'm skipping
> > over it.)
> > 2) If I skip it all in one readBin() my 'n' value is often up to 20 times
> > too big (I get an error) and/or R won't let me "allocate a vector of
> > size...." etc. So I split it up into chunks (divide by 20 etc.) and read
> > each chunk, then trash each part that is readBin()'d. Then on the last
> > line I get the data that I want (data1).
> >
> > Here is my working code:
> >
> > # I have to read 'junk' bits from the to.read file, which is a huge integer,
> > # so I divide it up and loop through to.read in parts (jb_part).
> >  divr = 20
> >  mod = junk %% divr
> >
> >  jb_part = as.integer(junk/divr)
> >  jb_part_mod = jb_part + mod # catch the remainder/modulus
> >
> > # connect to the binary file
> >  to.read = file(paste(dbs_path,"/",dbs_file,sep=""),"rb")
> > # loop in chunks to where I want to be
> >  for(i in 1:(divr-1)){
> >    x = readBin(to.read,"raw",n=jb_part,size=1)
> >    x = NULL # trash the result b/c I don't want it
> >  }
> > # read a little more to include the remainder/modulus bits left over by
> > # dividing by 20 above
> >  x = readBin(to.read,'raw',n=jb_part_mod,size=1)
> >  x = NULL # trash it
> >
> > # finally get the data that I want
> > data1 = readBin(to.read,double(),n=some_number,size=size_to_use)
> >
> > This works, but it is SLOW!  Any ideas on how to get down to the correct
> > bit a bit quicker (pun intended). :)
> >
> > Thanks for any help!
> >
> > Ben
> >
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>

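For anyone finding this thread later: a rough sketch of the two ideas
discussed above, not a tested solution for the .dbs files in question. Base R's
seek() can move a file connection to a byte offset without reading the
skipped data, and a value that starts mid-byte can be rebuilt by reading the
surrounding bytes and unpacking their bits, as Jim describes. The helper
names skip_and_read and read_bits_at are made up for illustration, and the
bit order (most significant bit first within each byte) is an assumption
that may not match the real file layout.

```r
# Hypothetical helpers for illustration; neither is part of base R.

# Jump past `junk` bytes with seek() instead of reading and discarding
# them, then read `n` doubles of `size` bytes each.
skip_and_read <- function(path, junk, n, size = 8) {
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = junk, origin = "start")  # 0-based byte offset
  readBin(con, what = double(), n = n, size = size)
}

# Read `nbits` bits starting at the 0-based bit offset `bit_offset` and
# return them as an unsigned number.
# ASSUMPTION: bits are numbered most-significant-first within each byte.
read_bits_at <- function(path, bit_offset, nbits) {
  con <- file(path, "rb")
  on.exit(close(con))
  first_byte <- bit_offset %/% 8           # byte that holds the first bit
  skip_bits  <- bit_offset %% 8            # bits to ignore in that byte
  nbytes     <- ceiling((skip_bits + nbits) / 8)
  seek(con, where = first_byte, origin = "start")
  bytes <- readBin(con, what = "raw", n = nbytes)
  # rawToBits() gives 8 bits per byte, least significant first, so flip
  # each column to get most-significant-first order.
  bits <- matrix(as.integer(rawToBits(bytes)), nrow = 8)[8:1, , drop = FALSE]
  bits <- as.vector(bits)[skip_bits + seq_len(nbits)]
  sum(bits * 2^((nbits - 1):0))
}
```

With the variables from the original post, something like
skip_and_read(paste(dbs_path, "/", dbs_file, sep = ""), junk,
n = some_number, size = size_to_use) would stand in for the chunked loop,
assuming junk is a count of bytes. The help page for seek() cautions about
its behaviour on Windows, so it is worth checking the result against the
slow version on a known file first.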
