Hi Bryan,

I haven't seen this appear on the list yet. Are you sure you sent from
the address you are subscribed with? I've been caught by that one
before elsewhere.

Thanks for the feedback. And David, Chris, and Mark too.

Bryan, nice to find someone working on a similar problem. I'll check
out your work. At first glace your ($indices,$values) pairs might be
similar to how I have addressed the problem up to now. Not within PDL,
but as a regular Perl hash.

Your logarithmic search time sounds interesting. I get linear time
searching for word pairs. Frankly I was glad to get it down from
exponential time in the size of the vectors. It may not be the same
search we are talking about. I'm searching for pairs between every
element of a vector, and every element of another.

However I do it though my memory usage just explodes when I try a
cross product. Too many intermediate products being summed, nesting
two or three levels of substitutions over 10k+ dimensions. I need to
keep all intermediate products in RAM at once or it is way too slow.
But with it all in RAM the memory usage blows out.

-Rob

P.S. Mark, I'm not sure if you're suggesting you manage to reduce your
multi-dimensional problem down to one dimension, or that you can get
rid of uncertainty by using multi-dimensions.

On Thu, Sep 29, 2011 at 5:30 PM, Bryan Jurish <[email protected]> wrote:
> morning all,
>
> Apologies for barging in on the discussion (this being my first post to
> this list); just saw one of my buzzwords and thought I might have
> something to contribute (and of course a chance for a shameless plug)... ;-)
>
> On 2011-09-28 16:04, chm wrote:
>> On 9/28/2011 8:15 AM, David Mertens wrote:
>>> Hey Rob,
>>>
>>> I've CC'd the PDL list in case somebody there can speak more to your
>>> concerns about PDL. I've never heard of anybody needing 10s of
>>> thousands of
>>> dimensions, and PDL might only support 256. Anybody know? As far as
>>> sparse
>>> matrix support, I never used it and it's not a crowning feature. Can
>>> anybody
>>> else speak more to the matter?
>
> PDL's sparse matrix support is basically non-existent, afaik.  There's a
> proof-of-concept implementation floating around somewhere in the core I
> looked at a few years ago, but iirc any kind of operation at all on a
> sparse piddle will blow it up into a dense one (or at least returns a
> dense pdl, which often amounts to the same thing as far as memory
> overflow is concerned).
>
> I also needed large sparse pdls in my work (very sparse 3d word
> co-occurrence matrices on the order of 50000**3), and wound up rolling
> my own bastard implementation out of PDL::PP and perl hackery; the
> module is available here:
>
>  http://search.cpan.org/~moocow/PDL-CCS-1.14/CCS/Nd.pm
>
> The basic idea is to store a sparse N-dimensional pdl as a pair
> ($indices,$values), where $indices is a lexicographically sorted list of
> index-vectors (a la whichND(), indexND()), and $values are the
> corresponding (non-zero) values.  Index operations are logarithmic time
> with vsearchvec(), and basic arithmetic operations and ufuncs can be
> twiddled with some block-wise processing heuristics.  Abstract matrix
> multiplication is hairy, and when it comes down to it these things are
> perl objects and not "real" pdls in their own right, so a lot of pdl's
> functionality gets lost.  It might be useful for the word-matrix
> application though...
>
> That having been said, I'd love to see some "real" sparse n-d matrix
> support in the pdl core; I'm just not sure to what degree the
> assumptions implicit in the core code might prevent an efficient sparse
> matrix implementation (e.g. explicit iteration over all possible values,
> implicit assumption of dense output pdls, etc.)
>
> marmosets,
>        Bryan
>
> --
> Bryan Jurish                           "There is *always* one more bug."
> [email protected]           -Lubarsky's Law of Cybernetic Entomology

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl

Reply via email to