On Wed, Aug 1, 2012 at 1:10 AM, <benjamin.bertr...@lfv.se> wrote:

> > There might be an easier way to do this with numpy dtypes.  In pseudo-
> > code:
> >
> > np.dtype([(colname, np.int16) for colname in colnames])
> >
>
> Can we use time and enum kinds that way as well?
>

Ahh, you might not be able to use these types, depending on your numpy
version.  Instead, I would use a dictionary comprehension of columns:

{colname: UInt16Col(pos=i) for i, colname in enumerate(colnames)}
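
Spelled out a bit more, that might look like the sketch below.  It assumes
the PyTables 3.x snake_case API (open_file / create_table) and an in-memory
file so nothing is written to disk; the column names and the node name
"records" are made up for illustration.

```python
import tables

# Hypothetical column names; in practice these would come from the
# data item definitions.
colnames = ["SPEED", "HEADING", "X", "Y"]

# One UInt16Col per name; pos fixes the column order in the table.
description = {name: tables.UInt16Col(pos=i) for i, name in enumerate(colnames)}

# In-memory HDF5 file (core driver without a backing store).
with tables.open_file("sketch.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    table = h5file.create_table("/", "records", description)
    created_colnames = list(table.colnames)
    col_dtype = table.coldtypes["SPEED"]
```

The same dictionary can mix column classes (Time32Col, EnumCol, ...) if the
names are mapped to different types instead of a single UInt16Col.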

> This makes me think I should probably flatten my table.
> Having nested columns is quite natural to group cells under a data item
> (it's nice to access a field like I030_180/SPEED).
> But that's more a naming convention and I can't perform in-kernel searches
> with nested columns.
>

It would sure be nice if we could though ;)
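
In the meantime, a flattened layout could keep the data item name as a
prefix of the column name.  This is only a sketch, assuming the PyTables 3.x
API; the prefixed names and the sample values are invented for the example.

```python
import tables

# Flat description: the data item name becomes a prefix of the column
# name instead of a nested group (I030_180/SPEED -> I030_180_SPEED).
description = {
    "I030_180_SPEED": tables.UInt16Col(pos=0),
    "I030_180_HEADING": tables.UInt16Col(pos=1),
}

with tables.open_file("flat_sketch.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    table = h5file.create_table("/", "records", description)
    table.append([(120, 90), (80, 180), (300, 270)])  # made-up rows
    table.flush()
    # The condition string refers to the flat column name directly and
    # is evaluated as an in-kernel query.
    fast = table.read_where("I030_180_SPEED > 100")
    fast_speeds = fast["I030_180_SPEED"].tolist()
```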


>
> >
> > If you want to mark the whole column as valid, you can use a boolean
> > attribute on the table itself for each column.  They could be named
> > like colname_valid.
> >
> > See
> > http://pytables.github.com/usersguide/libref.html#tables.Leaf.setAttr
> > and http://pytables.github.com/usersguide/libref.html#the-attributeset-
> > class for more info.
>
> Good to know.
>
> >
> > This is more for flagging individual cells as valid or not.  For
> > integers you need to pick a value that means invalid (like -999999).
> >
> >
>
> Yes, you are right. It's not a column I want to flag but a group of cells
> in a row (all cells of a particular data item).
> Problem is that if I have a uint8 for a cell, there is no invalid value I
> can use.
> Maybe I could use a "larger type" (uint16 in this case) to be able to pick
> an invalid value.
> Or a bool for this group of cells.
>

Yes, if you used a bool column next to it you could use this as a valid
mask.  Note that since bools are stored with a full byte, the bool column +
the uint8 column are the same size as a uint16.  However, it will be much
quicker to query over the bool column.
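
To illustrate the size argument and the mask with plain numpy (a sketch
only; the field names sb and sb_valid are hypothetical):

```python
import numpy as np

# Hypothetical row layout: a uint8 value plus a one-byte validity flag.
masked_dtype = np.dtype([("sb", np.uint8), ("sb_valid", np.bool_)])

rows = np.array([(7, True), (0, False), (255, True)], dtype=masked_dtype)

# Both layouts cost two bytes per cell.
masked_size = masked_dtype.itemsize
wide_size = np.dtype(np.uint16).itemsize

# Selecting the valid cells is a plain boolean mask, and 255 stays
# available as a real value because no sentinel is needed.
valid_values = rows["sb"][rows["sb_valid"]].tolist()
```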

Be Well
Anthony


>
> >
> > If you don't want to use a VLArray, then maxlen is probably your best
> > option.
> >
> > If you want to do something a little more sophisticated, you could
> > break your data out into a main table and then a helper VLArray.  Every
> > row in the table is matched by the same row in vlarray.  Then when you
> > want to get your full data back out, you have to go to the table and
> > the vlarray.  This makes things a little more annoying to work with,
> > but it does what you want.
>
> Interesting. I'll look more into it.
>
> >
> > Hope this helps.  Feel free to ask more questions!
>
> Yes, it does.
> Thanks!
>
>
> Benjamin
>
>
> > Be Well
> > Anthony
> >
> >
> >
> >
> >
> >
> >       Cheers,
> >
> >
> >
> >       Benjamin
> >
> >
> >
> >
> >
> >       class I030_180_DESC(tables.IsDescription):
> >
> >           """Calculated Track Velocity (Polar)"""
> >
> >           SPEED = tables.UInt16Col(pos=0)
> >
> >           HEADING = tables.UInt16Col(pos=1)
> >
> >
> >
> >       class I030_181_DESC(tables.IsDescription):
> >
> >           """Calculated Track Velocity (Cartesian)"""
> >
> >           X = tables.Int16Col(pos=0)
> >
> >           Y = tables.Int16Col(pos=1)
> >
> >
> >
> >       class I030_340_DESC(tables.IsDescription):
> >
> >           """Last Measured Mode 3/A"""
> >
> >           V = tables.EnumCol(tables.Enum({
> >
> >               "Code validated": 0,
> >
> >               "Code not validated": 1,
> >
> >               "uninitialized": 255
> >
> >               }), "uninitialized",
> >
> >               base="uint8",
> >
> >               pos=0)
> >
> >           G = tables.EnumCol(tables.Enum({
> >
> >               "Default": 0,
> >
> >               "Garbled code": 1,
> >
> >               "uninitialized": 255
> >
> >               }), "uninitialized",
> >
> >               base="uint8",
> >
> >               pos=1)
> >
> >           L = tables.EnumCol(tables.Enum({
> >
> >               "MODE 3/A code as derived from the reply of the transponder,": 0,
> >
> >               "Smoothed MODE 3/A code as provided by a local tracker": 1,
> >
> >               "uninitialized": 255
> >
> >               }), "uninitialized",
> >
> >               base="uint8",
> >
> >               pos=2)
> >
> >           sb = tables.UInt8Col(pos=3)
> >
> >           mode_3_a = tables.UInt16Col(pos=4)
> >
> >
> >
> >       class I030_400_DESC(tables.IsDescription):
> >
> >           """Callsign"""
> >
> >           callsign = tables.StringCol(7, pos=0)
> >
> >
> >
> >       class I030_050_DESC(tables.IsDescription):
> >
> >           """Artas Track Number"""
> >
> >           AUI = tables.UInt8Col(pos=0)
> >
> >           unused = tables.UInt8Col(pos=1)
> >
> >           STN = tables.UInt16Col(pos=2)
> >
> >           FX = tables.EnumCol(tables.Enum({
> >
> >               "end of data item": 0,
> >
> >               "extension into next extent": 1,
> >
> >               "uninitialized": 255
> >
> >               }), "uninitialized",
> >
> >               base="uint8",
> >
> >               pos=3)
> >
> >
> >
> >       class I030Record(tables.IsDescription):
> >
> >           """Cat 030 record"""
> >
> >           ff_timestamp = tables.Time32Col()
> >
> >           I030_010 = I030_010_DESC()
> >
> >           I030_015 = I030_015_DESC()
> >
> >           I030_030 = I030_030_DESC()
> >
> >           I030_035 = I030_035_DESC()
> >
> >           I030_040 = I030_040_DESC()
> >
> >           I030_070 = I030_070_DESC()
> >
> >           I030_170 = I030_170_DESC()
> >
> >           I030_100 = I030_100_DESC()
> >
> >           I030_180 = I030_180_DESC()
> >
> >           I030_181 = I030_181_DESC()
> >
> >           I030_060 = I030_060_DESC()
> >
> >           I030_150 = I030_150_DESC()
> >
> >           I030_140 = I030_140_DESC()
> >
> >           I030_340 = I030_340_DESC()
> >
> >           I030_400 = I030_400_DESC()
> >
> >       ...
> >
> >           I030_210 = I030_210_DESC()
> >
> >           I030_120 = I030_120_DESC()
> >
> >           I030_050 = I030_050_DESC()
> >
> >           I030_270 = I030_270_DESC()
> >
> >           I030_370 = I030_370_DESC()
> >
> >
> >
> >
> >
> >       From: Anthony Scopatz [mailto:scop...@gmail.com]
> >       Sent: 12 July 2012 00:02
> >       To: Discussion list for PyTables
> >       Subject: Re: [Pytables-users] advice on using PyTables
> >
> >
> >
> >       Hello Benjamin,
> >
> >
> >
> >       Not knowing too much about the ASTERIX format, other than
> > what you said and what is in the links, I would say that this is a good
> > fit for HDF5 and PyTables.  PyTables will certainly help you read in
> > the data and manipulate it.
> >
> >
> >
> >       However, before you abandon hachoir completely, I will say
> > it is a lot easier to write hdf5 files in PyTables than to use the HDF5
> > C API.   If hachoir is too slow, have you tried profiling the code to
> > see what is taking up the most time?  Maybe you could just rewrite
> > these parts in C?  Have you looked into Cythonizing it?  Also, you
> > don't seem to be using numpy to read in the data... (there are some
> > tricks given ASTERIX here, but not insurmountable).
> >
> >
> >
> >       I ask the above, just so you don't have to completely
> > rewrite everything.  You are correct though that pure python is
> > probably not sufficient.  Feel free to ask more questions here.
> >
> >
> >
> >       Be Well
> >
> >       Anthony
> >
> >
> >
> >       On Wed, Jul 11, 2012 at 6:52 AM, <benjamin.bertr...@lfv.se>
> > wrote:
> >
> >       Hi,
> >
> >       I'm working with Air Traffic Management and would like to
> > perform checks / compute statistics on ASTERIX data.
> >       ASTERIX is an ATM Surveillance Data Binary Messaging Format
> > (http://www.eurocontrol.int/asterix/public/standard_page/overview.html)
> >
> >       The data consist of a concatenation of consecutive data
> > blocks.
> >       Each data block consists of data category + length +
> > records.
> >       Each record is of variable length and consists of several
> > data items (that are well defined for each category).
> >       Some data items might be present or not depending on a field
> > specification (bitfield).
> >
> >       I started to write a parser using hachoir
> > (https://bitbucket.org/haypo/hachoir/overview) a pure python library.
> >       But the parsing was really too slow and taking a lot of
> > memory.
> >       That's not really useable.
> >
> >       From what I read, PyTables could really help to manipulate
> > and analyze the data.
> >       So I've been thinking about writing a tool (probably in C)
> > to convert my ASTERIX format to HDF5.
> >
> >       Before I start, I'd like confirmation that this seems like a
> > suitable application for PyTables.
> >       Is there another approach than writing a conversion tool to
> > HDF5?
> >
> >       Thanks in advance
> >
> >       Benjamin
> >
> >
> >
> >
> >       ------------------------------------------------------------
> > ------------------
> >       Live Security Virtual Conference
> >       Exclusive live event will cover all the ways today's
> > security and
> >       threat landscape has changed and how IT managers can
> > respond. Discussions
> >       will include endpoint security, mobile security and the
> > latest in malware
> >       threats.
> > http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> >       _______________________________________________
> >       Pytables-users mailing list
> >       Pytables-users@lists.sourceforge.net
> >       https://lists.sourceforge.net/lists/listinfo/pytables-users
> >
> >
> >
>
>
>
