Re: [Pytables-users] Expr performance with Tables on multicore machines

Francesc Alted Mon, 14 May 2012 14:12:01 -0700

On 5/14/12 3:12 PM, Anthony Scopatz wrote:

On Mon, May 14, 2012 at 3:05 PM, Francesc Alted <fal...@pytables.org> wrote:
[snip]

    However, do not expect to use all your cores at full speed in these
    cases, as reductions in numexpr can currently use only one
    thread (this is because multi-threaded reductions have not been
    implemented yet, not because of an intrinsic limitation of numexpr).


Hello Francesc,
Not to sidetrack the discussion too much, but is there a ticket open for this in numexpr?


There is:

http://code.google.com/p/numexpr/issues/detail?id=73

It seems that, at least for certain reductions (sum, mult, etc.), splitting this up over many cores would be pretty easy. I may be wrong about this though ;)

It looks like it should be easy, but in practice it turns out to be a bit harder ;) I remember spending some quality time on this, but I was not able to solve the problem. It is *solvable* for sure, though.
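
[Editor's sketch] The idea of splitting a sum reduction over several cores can be illustrated outside numexpr. The following is a minimal sketch, not numexpr's implementation and not the fix for the ticket: it assumes the two columns are already in memory as NumPy arrays, and the function names, `nchunks` and `nthreads` are hypothetical. Each chunk is reduced in a worker thread and the partial sums are added at the end; whether this actually runs faster depends on how much of the work NumPy does with the GIL released.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def partial_sum_of_squares(a, b, p):
    """Sum of (a - (b + p0*a*a + p1*a + p2))**2 over one chunk."""
    db = p[0] * a * a + p[1] * a + p[2]
    # Accumulate in float64 so Float32 inputs do not lose precision.
    return float(((a - (b + db)) ** 2).sum(dtype=np.float64))


def chunked_sum_of_squares(a, b, p, nchunks=4, nthreads=4):
    """Reduce each chunk in a worker thread, then add the partial sums."""
    a_chunks = np.array_split(a, nchunks)
    b_chunks = np.array_split(b, nchunks)
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        partials = pool.map(
            lambda ab: partial_sum_of_squares(ab[0], ab[1], p),
            zip(a_chunks, b_chunks),
        )
        # A sum is associative, so the partial results can simply be added.
        return sum(partials)
```

The final addition of the partial sums is the only serial step, which is why sum-like reductions look easy to parallelize on paper.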


Anyway, after looking into the ticket above, the following code could be faster:

    fn_str = '(a - (b + %s))**2' % db
    expr = Expr(fn_str,uservars=uv)
    # returning the "sum of squares"
    return expr.eval().sum()

Which is basically what you were suggesting: using numpy.sum(). But definitely, the elegant solution would be to make reductions use multiple cores in numexpr.
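
[Editor's sketch] The difference between the built-in sum() and the array's own .sum() can be seen with plain NumPy. This is a minimal illustration under the assumption that the columns a and b are in-memory NumPy arrays standing in for the table columns; the function names are hypothetical and no PyTables objects are involved.

```python
import numpy as np


def elementwise(a, b, p):
    """The element-wise function from the thread: (a - (b + db))**2."""
    db = p[0] * a * a + p[1] * a + p[2]
    return (a - (b + db)) ** 2


def sum_of_squares_builtin(a, b, p):
    # The built-in sum() pulls one element at a time into the
    # interpreter, so the cost grows with a Python-level loop.
    return sum(elementwise(a, b, p))


def sum_of_squares_numpy(a, b, p):
    # ndarray.sum() runs the whole reduction in one compiled loop.
    return elementwise(a, b, p).sum()
```

Both functions return the same value up to floating-point rounding; timing them on a few million elements (for example with time.time(), as the attached script does) shows how much the interpreter-level loop costs.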


Francesc


Be Well
Anthony


    Francesc


    I hope this helps.  If you need other tips on speeding up the
    sum operation, please let us know.

    Be Well
    Anthony

    Timer unit: 1e-06 s

    File: pytables_expr_test.py
    Function: fn at line 66
    Total time: 1.63254 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
        66                                           def fn(p, h5table):
        67                                               '''
        68                                                   actual function we are going to minimize. It consists of
        69                                                   the pytables Table object and a list of parameters.
        70                                               '''
        71         1           14     14.0      0.0      uv = h5table.colinstances
        72
        73                                               # store parameters in a dict object with names
        74                                               # like p0, p1, p2, etc. so they can be used in
        75                                               # the Expr object.
        76         4           21      5.2      0.0      for i in xrange(len(p)):
        77         3           19      6.3      0.0          k = 'p'+str(i)
        78         3           14      4.7      0.0          uv[k] = p[i]
        79
        80                                               # systematic shift on b is a polynomial in a
        81         1            4      4.0      0.0      db = 'p0 * a*a  +  p1 * a  +  p2'
        82
        83                                               # the element-wise function
        84         1            6      6.0      0.0      fn_str = '(a - (b + %s))**2' % db
        85
        86         1        16427  16427.0      1.0      expr = Expr(fn_str,uservars=uv)
        87         1        21438  21438.0      1.3      expr.eval()
        88
        89                                               # returning the "sum of squares"
        90         1      1594600 1594600.0     97.7      return sum(expr)
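
[Editor's sketch] The percentages in the profile can be re-derived from the Time column. The snippet below only redoes that arithmetic with the numbers copied from the listing; the dictionary name is hypothetical.

```python
# Per-line times in timer units (1e-06 s), keyed by source line number,
# copied from the Time column of the profile.
line_times_us = {
    71: 14, 76: 21, 77: 19, 78: 14, 81: 4, 84: 6,
    86: 16427,    # expr = Expr(fn_str, uservars=uv)
    87: 21438,    # expr.eval()
    90: 1594600,  # return sum(expr)
}

total_us = sum(line_times_us.values())
total_s = total_us * 1e-6  # matches the reported total of 1.63254 s

# Share of the total time spent on each line, in percent.
share = {line: 100.0 * t / total_us for line, t in line_times_us.items()}

for line in (86, 87, 90):
    print("line %d: %.1f%%" % (line, share[line]))
```

Read this way, expr.eval() accounts for roughly 1.3% of the run while the final sum(expr) accounts for roughly 97.7%, which is why the replies concentrate on the sum.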




    On Mon, May 14, 2012 at 1:59 PM, Johann Goetz <jgo...@ucla.edu> wrote:

        SHORT VERSION:

        Please take a look at the fn() function in the attached file
        (pasted below). When I run this with 10M events or more I
        notice that the total CPU usage never goes above the
        percentage I get using single-threaded eval(). Am I at some
        other limit or can I improve performance by doing something else?

        LONG VERSION:

        I have been trying to use the tables.Expr object to speed up
        a sophisticated calculation over an entire dataset (a
        pytables Table object). The calculation took so long that I
        had to create a simple example to make sure I knew what I was
        doing. I apologize in advance for the lengthy code below, but
        I wanted the example to mimic exactly what I'm trying to do
        and to be totally self-contained.

        I have attached a file (and pasted it below) in which I
        create an hdf5 file with a single large Table of two columns.
        As you can see, I'm not worried about writing speed at all -
        I'm concerned about read speed.

        I would like to draw your attention to the fn() function.
        This is where I evaluate a "chi-squared" value on the
        dataset. My strategy is to populate the
        "h5table.colinstances" dict object with several parameters
        which I call p0, p1, etc and then create the Expr object
        using these and the column names from the Table.

        If I create 10M rows (77 MB file) in the Table (with the
        command below), the evaluation seems to be CPU bound (one of
        my cores is at 100% - the others are idle) and it takes about
        7 seconds (about 10 MB/s). Similarly, I get about 70 seconds
        for 100M events.

        python pytables_expr_test.py 10000000
        python pytables_expr_test.py 100000000
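
[Editor's sketch] The quoted throughput can be checked with a few lines of arithmetic. This assumes 4 bytes per Float32 value and ignores any HDF5 overhead; the helper names are hypothetical, and the 77 MB and 7 second figures are the ones reported in the message.

```python
BYTES_PER_ROW = 2 * 4  # two Float32 columns (a and b), 4 bytes each


def raw_size_bytes(nrows):
    """Raw payload of the table, without HDF5 metadata or chunk overhead."""
    return nrows * BYTES_PER_ROW


def throughput_mb_per_s(size_mb, seconds):
    """Average read rate implied by a file size and a wall-clock time."""
    return size_mb / seconds


rows = 10000000
print(raw_size_bytes(rows) / 2.0 ** 20)  # raw payload in MiB
print(throughput_mb_per_s(77.0, 7.0))    # rate implied by the reported figures
```

The raw payload of 10M rows is 80,000,000 bytes (a little over 76 MiB), consistent with the reported 77 MB file, and 77 MB in 7 seconds works out to 11 MB/s, in line with the "about 10 MB/s" quoted in the message.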

        So my question:  It seems to me that I am not fully using the
        CPU power available on my computer (see next paragraph). Am I
        missing something or doing something wrong in the fn()
        function below?

        A few side-notes: My hard-disk is capable of over 200 MB/s in
        sequential reading (sustained and tested with large files
        using the iozone program), I have two 4-core CPUs on this
        machine but the total CPU usage during eval() never goes
        above the percentage I get using single-threaded mode with
        "numexpr.set_num_threads(1)".

        I am using pytables 2.3.1 and numexpr 2.0.1

--Johann T. Goetz, PhD.

        <http://sites.google.com/site/theodoregoetz/>
        jgo...@ucla.edu <mailto:jgo...@ucla.edu>
        Nefkens Group, UCLA Dept. of Physics & Astronomy
        Hall-B, Jefferson Lab, Newport News, VA


        ### BEGIN file: pytables_expr_test.py

        from tables import openFile, Expr

        ### Control of the number of threads used when issuing the
        ### Expr::eval() command
        #import numexpr
        #numexpr.set_num_threads(2)

        def create_ntuple_file(filename, npoints, pmodel):
            '''
                create an hdf5 file with a single table which contains
                npoints number of rows of type row_t (defined below)
            '''
            from numpy import random, poly1d
            from tables import IsDescription, Float32Col

            class row_t(IsDescription):
                '''
                    the rows of the table to be created
                '''
                a = Float32Col()
                b = Float32Col()

            def append_row(h5row, pmodel):
                '''
                    consider this a single "event" being appended
                    to the dataset (table)
                '''
                h5row['a'] = random.uniform(0,10)

                h5row['b'] = h5row['a'] # reality (or model)
                h5row['b'] = h5row['b'] - poly1d(pmodel)(h5row['a']) # systematics
                h5row['b'] = h5row['b'] + random.normal(0,0.1) # noise

                h5row.append()

            h5file = openFile(filename, 'w')
            h5table = h5file.createTable('/', 'table', row_t, "Data")
            h5row = h5table.row

            # recording data to file...
            for n in xrange(npoints):
                append_row(h5row, pmodel)

            h5file.close()

        def create_ntuple_file_if_needed(filename, npoints, pmodel):
            '''
                looks to see if the file is already there and if so,
                it makes sure it's the right size. Otherwise, it
                removes the existing file and creates a new one.
            '''
            from os import path, remove

            print 'model parameters:', pmodel

            if path.exists(filename):
                h5file = openFile(filename, 'r')
                nrows = len(h5file.root.table)
                # always close the handle, even when the size matches
                h5file.close()
                if nrows != npoints:
                    remove(filename)

            if not path.exists(filename):
                create_ntuple_file(filename, npoints, pmodel)

        def fn(p, h5table):
            '''
                actual function we are going to minimize. It consists of
                the pytables Table object and a list of parameters.
            '''
            uv = h5table.colinstances

            # store parameters in a dict object with names
            # like p0, p1, p2, etc. so they can be used in
            # the Expr object.
            for i in xrange(len(p)):
                k = 'p'+str(i)
                uv[k] = p[i]

            # systematic shift on b is a polynomial in a
            db = 'p0 * a*a  +  p1 * a  +  p2'

            # the element-wise function
            fn_str = '(a - (b + %s))**2' % db

            expr = Expr(fn_str,uservars=uv)
            expr.eval()

            # returning the "sum of squares"
            return sum(expr)

        if __name__ == '__main__':
            '''
            usage:
                python pytables_expr_test.py [npoints]

            Hint: try this with 10M points
            '''
            from sys import argv
            from time import time

            npoints = 1000000
            if len(argv) > 1:
                npoints = int(argv[1])

            filename = 'tmp.'+str(npoints)+'.hdf5'

            pmodel = [-0.04,0.002,0.001]

            print 'creating file (if it doesn\'t exist)...'
            create_ntuple_file_if_needed(filename, npoints, pmodel)

            h5file = openFile(filename, 'r')
            h5table = h5file.root.table

            print 'evaluating function'
            starttime = time()
            print fn([0.,0.,0.], h5table)
            print 'evaluated file in',time()-starttime,'seconds.'

        #EOF


        
------------------------------------------------------------------------------
        Live Security Virtual Conference
        Exclusive live event will cover all the ways today's security and
        threat landscape has changed and how IT managers can respond.
        Discussions
        will include endpoint security, mobile security and the
        latest in malware
        threats.
        http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
        _______________________________________________
        Pytables-users mailing list
        Pytables-users@lists.sourceforge.net
        <mailto:Pytables-users@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/pytables-users




    

--Francesc Alted



    







--
Francesc Alted

