Re: [go-nuts] Re: Large 2D slice performance question

2016-12-09 Thread Michael Jones
(Reply quoting Mandolyte's message of Friday, December 9, 2016, 7:46 AM, which begins "This worked out well..."; see the next entry.)

[go-nuts] Re: Large 2D slice performance question

2016-12-09 Thread Mandolyte
This worked out well. I was able to "materialize" over 1200 trees in under an hour. Pretty amazing. I didn't end up using the DenseSet code. Your quick-and-dirty version only handled cycles starting from the start value, but other than that the concept worked out very well. I did try a version…

[go-nuts] Re: Large 2D slice performance question

2016-12-01 Thread Mandolyte
Wow, this will take some time to digest. Regrettably I have required training today and won't be able to even play with this until tomorrow. In my (parent, child) data there are 304K unique parent values and 1.05M unique child values. Of course many child values are also parent values. Thus, total…

[go-nuts] Re: Large 2D slice performance question

2016-12-01 Thread Egon
See whether this works better: https://gist.github.com/egonelbre/d94ea561c3e63db009718e227e506b5b There is a lot of room for improvement (e.g. for your actual dataset, increase defaultNameCount). PS: Avoid the csv package for writing files; use bufio and Write directly... csv does some extra…
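A minimal sketch of that suggestion (the function name, file path, and comma-separated layout are assumptions, not from the thread): write the (parent, child) rows through a bufio.Writer and skip the per-record quoting work encoding/csv does. This only holds if the part numbers never contain commas, quotes, or newlines.

package main

import (
	"bufio"
	"os"
)

// writePairs writes (parent, child) rows as comma-separated lines using
// bufio directly instead of encoding/csv. Assumes no field contains a
// comma, quote, or newline, so no CSV escaping is needed.
func writePairs(path string, pairs [][2]string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	w := bufio.NewWriter(f)
	for _, p := range pairs {
		w.WriteString(p[0])
		w.WriteByte(',')
		w.WriteString(p[1])
		w.WriteByte('\n')
	}
	return w.Flush()
}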

[go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread Mandolyte
Thanks for the discussion! Package with tester is at: https://github.com/mandolyte/TableRecursion While I can't share the data, I could take a sample set of paths for a root node, reverse-engineer the pairs, and obfuscate... I've done this sort of thing before, but it is a bit of work. So I'll…

[go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread Mandolyte
The finite set idea might work, but the set is well over 300K. The strings (part numbers) are not regular. I could make a single pass over the "parent" column and record in a map[string]int the index of the first occurrence. Then I would avoid sort.Search() having to find it each time. Or use…
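A minimal sketch of that indexing idea (function and variable names are hypothetical), assuming the rows are already sorted by parent so each parent's rows are contiguous starting at the recorded index:

package main

import "fmt"

// buildParentIndex makes one pass over the rows and records, for each
// parent value, the index of its first occurrence. Looking up a parent
// then becomes a map access instead of a sort.Search over ~115M rows.
func buildParentIndex(pairs [][2]string) map[string]int {
	idx := make(map[string]int, 304000) // roughly the number of unique parents
	for i, p := range pairs {
		if _, ok := idx[p[0]]; !ok {
			idx[p[0]] = i
		}
	}
	return idx
}

func main() {
	pairs := [][2]string{{"a", "x"}, {"a", "y"}, {"b", "z"}}
	fmt.Println(buildParentIndex(pairs)) // map[a:0 b:2]
}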

Re: [go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread Mandolyte
(Reply quoting the original question of 30 November 2016 about the ~115M-row (parent, child) slice; the quoted text appears in the final entry below.)

Re: [go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread Michael Jones
(Reply quoting the original question of 30 November 2016; the quoted text appears in the final entry below.)

[go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread adonovan via golang-nuts
(Quotes the original question of 30 November 2016; the quoted text appears in the next entry.)

[go-nuts] Re: Large 2D slice performance question

2016-11-30 Thread Egon
On Wednesday, 30 November 2016 03:37:55 UTC+2, Mandolyte wrote: I have a fairly large 2D slice of strings, about 115M rows. These are (parent, child) pairs that, processed recursively, form a tree. I am "materializing" all possible trees as well as determining for each root node all…
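For orientation, a minimal sketch of the setup the question describes (all names and the sample data are hypothetical, not from the thread): group the (parent, child) pairs into an adjacency map, then walk it recursively from a root node, guarding against cycles in the pairs.

package main

import "fmt"

// children groups the (parent, child) pairs into an adjacency map.
func children(pairs [][2]string) map[string][]string {
	m := make(map[string][]string)
	for _, p := range pairs {
		m[p[0]] = append(m[p[0]], p[1])
	}
	return m
}

// walk prints every root-to-leaf path; seen guards against cycles so the
// recursion cannot loop forever on cyclic input.
func walk(m map[string][]string, node string, path []string, seen map[string]bool) {
	if seen[node] {
		return // cycle: stop rather than recurse forever
	}
	seen[node] = true
	defer delete(seen, node)

	path = append(path, node)
	kids := m[node]
	if len(kids) == 0 {
		fmt.Println(path) // a fully materialized path for this tree
		return
	}
	for _, c := range kids {
		walk(m, c, path, seen)
	}
}

func main() {
	pairs := [][2]string{{"a", "b"}, {"a", "c"}, {"b", "d"}}
	walk(children(pairs), "a", nil, map[string]bool{})
}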