On Thu, Jan 3, 2013 at 2:17 PM, David Reed <david.ree...@gmail.com> wrote:
> Thanks a lot for the help so far guys!
>
> Looking at itertools, I found what I believe to be the perfect function
> for what I need, itertools.combinations. This appears to be a valid
> replacement for the method proposed.
>
Yes, combinations is awesome!
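
As a quick sanity check (a minimal sketch, not taken from the thread's code): `combinations(range(n), 2)` yields the same (ii, jj) index pairs, in the same order, as the nested `ii`/`jj` loops, for n(n-1)/2 pairs in total. The helper name `nested_pairs` and the size `n = 5` are hypothetical and used only for illustration.

```python
from itertools import combinations


def nested_pairs(n):
    """Index pairs visited by the original nested loops (hypothetical helper)."""
    return [(ii, jj) for ii in range(n) for jj in range(ii + 1, n)]


n = 5  # illustrative size only
pairs = list(combinations(range(n), 2))

# Same pairs, same order, and n*(n-1)/2 of them.
assert pairs == nested_pairs(n)
assert len(pairs) == n * (n - 1) // 2
```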
>
> One small problem I didn't mention is that my compare function
> actually takes 2 columns from the table as inputs, like so:
>
> D = np.empty((N_elements, N_elements))
> for ii in xrange(N_elements):
>     for jj in xrange(ii+1, N_elements):
>         D[ii, jj] = compare(data['element1'][ii],
>                             data['element1'][jj],
>                             data['element2'][ii],
>                             data['element2'][jj])
>
> Is there an efficient way of using itertools with this structure?
>
You can always make separate iterators for each column. Since each
comparison takes two values from each of your two columns, you would need
4 iterators. I am not sure how fast this is going to be, but I am confident
there is a way to do this in a single for-loop, which should be faster than
nested loops.
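
A rough sketch of the single-loop idea, under stated assumptions rather than as tested PyTables code: the plain lists below stand in for the `element1` and `element2` columns, `compare` is a made-up placeholder for the real function, and `pairwise` is a hypothetical helper name. It loops over index pairs rather than over the rows themselves, because `itertools.combinations` copies its whole input into a tuple before yielding anything, which would pull a large table into memory.

```python
from itertools import combinations


def compare(a_i, a_j, b_i, b_j):
    """Placeholder metric (hypothetical); substitute the real compare()."""
    return abs(a_i - a_j) + abs(b_i - b_j)


def pairwise(element1, element2):
    """Compare every unordered pair of rows in a single for-loop.

    element1 and element2 are any indexable columns of equal length.
    Returns a dict mapping (ii, jj), with ii < jj, to the compare() result.
    """
    n = len(element1)
    result = {}
    for ii, jj in combinations(range(n), 2):
        result[ii, jj] = compare(element1[ii], element1[jj],
                                 element2[ii], element2[jj])
    return result


# Plain lists standing in for data['element1'] and data['element2'].
D = pairwise([1, 4, 6], [10, 20, 40])
```

With NumPy, the same loop could write into `D[ii, jj]` of a preallocated array instead of a dict. Whether this actually beats the nested loops on HDF5-backed columns would need to be measured.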
Be Well
Anthony
>
>
> On Thu, Jan 3, 2013 at 1:29 PM, <
> pytables-users-requ...@lists.sourceforge.net> wrote:
>
>> Today's Topics:
>>
>> 1. Re: Nested Iteration of HDF5 using PyTables (Josh Ayers)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Thu, 3 Jan 2013 10:29:33 -0800
>> From: Josh Ayers <josh.ay...@gmail.com>
>> Subject: Re: [Pytables-users] Nested Iteration of HDF5 using PyTables
>> To: Discussion list for PyTables
>> <pytables-users@lists.sourceforge.net>
>>
>> David,
>>
>> The change in issue 27 was only for iteration over a tables.Column
>> instance. To use it, tweak Anthony's code as follows. This will iterate
>> over the "element" column, as in your original example.
>>
>> Note also that this will only work with the development version of
>> PyTables
>> available on github. It will be very slow using the released v2.4.0.
>>
>>
>> from itertools import izip
>>
>> import tables as tb
>>
>> with tb.openFile(...) as f:
>>     data = f.root.data.cols.element
>>     data_i = iter(data)
>>     data_j = iter(data)
>>     data_i.next()  # throw the first value away
>>     # data_i now runs one row ahead of data_j, so each pass
>>     # compares row k+1 with row k
>>     for i, j in izip(data_i, data_j):
>>         compare(i, j)
>>
>>
>> Hope that helps,
>> Josh
>>
>>
>>
>> On Thu, Jan 3, 2013 at 9:11 AM, Anthony Scopatz <scop...@gmail.com>
>> wrote:
>>
>> > Hi David,
>> >
>> > Tables and table column iteration have been overhauled fairly recently
>> > [1]. So you might try creating two iterators, offset by one, and then
>> > doing the comparison. I am hacking this out super quick so please
>> > forgive me:
>> >
>> > from itertools import izip
>> >
>> > import tables as tb
>> >
>> > with tb.openFile(...) as f:
>> >     data = f.root.data
>> >     data_i = iter(data)
>> >     data_j = iter(data)
>> >     data_i.next()  # throw the first value away
>> >     for i, j in izip(data_i, data_j):
>> >         compare(i, j)
>> >
>> > You get the idea ;)
>> >
>> > Be Well
>> > Anthony
>> >
>> > 1. https://github.com/PyTables/PyTables/issues/27
>> >
>> >
>> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <david.ree...@gmail.com>
>> > wrote:
>> >
>> >> I was hoping someone could help me out here.
>> >>
>> >> This is from a post I put up on StackOverflow:
>> >>
>> >> I have a fairly large dataset that I store in HDF5 and access using
>> >> PyTables. One operation I need to perform on this dataset is a pairwise
>> >> comparison between each of the elements. This requires 2 loops: an outer
>> >> loop to iterate over each element, and an inner loop to iterate over
>> >> every other element. This operation thus performs N(N-1)/2 comparisons.
>> >>
>> >> For fairly small sets I found it faster to dump the contents into a
>> >> multidimensional numpy array and then do my iteration. With large sets
>> >> I run into memory issues and need to access each element of the dataset
>> >> at run time.
>> >>
>> >> Putting the elements into an array gives me about 600 comparisons per
>> >> second, while operating on the HDF5 data itself gives me about 300
>> >> comparisons per second.
>> >>
>> >> Is there a way to speed this process up?
>> >>
>> >> Example follows (this is not my real code, just an example):
>> >>
>> >> *Small Set*:
>> >>
>> >>
>> >> import numpy as np
>> >> import tables as tb
>> >>
>> >> with tb.openFile(h5_file, 'r') as f:
>> >>     data = f.root.data
>> >>
>> >>     N_elements = len(data)
>> >>     elements = np.empty((N_elements, int(1e5)))
>> >>
>> >>     # copy each row's 'element' field into memory
>> >>     for ii, d in enumerate(data):
>> >>         elements[ii] = d['element']
>> >>
>> >>     D = np.empty((N_elements, N_elements))
>> >>     for ii in xrange(N_elements):
>> >>         for jj in xrange(ii+1, N_elements):
>> >>             D[ii, jj] = compare(elements[ii], elements[jj])
>> >>
>> >> *Large Set*:
>> >>
>> >>
>> >> with tb.openFile(h5_file, 'r') as f:
>> >>     data = f.root.data
>> >>
>> >>     N_elements = len(data)
>> >>
>> >>     D = np.empty((N_elements, N_elements))
>> >>     for ii in xrange(N_elements):
>> >>         for jj in xrange(ii+1, N_elements):
>> >>             D[ii, jj] = compare(data['element'][ii],
>> >>                                 data['element'][jj])
>> >>
>> >>
>> >>
>> >> ------------------------------------------------------------------------------
>> >> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
>> >> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
>> >> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
>> >> MVPs and experts. ON SALE this month only -- learn more at:
>> >> http://p.sf.net/sfu/learnmore_122712
>> >> _______________________________________________
>> >> Pytables-users mailing list
>> >> Pytables-users@lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/pytables-users
>> >>
>> >>
>> End of Pytables-users Digest, Vol 80, Issue 3
>> *********************************************
>>
>