[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-10 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/256 LGTM. Default selection looks reasonable: (0) data DROP TABLE IF EXISTS iris_data; CREATE TABLE iris_data( id serial, attributes numeric[], class_t

[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-06 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/256 Oh I see, with the averaging approach: buffer_size = avg_num_rows_per_segment / num_segments = 21.5 / 2 = 10.75 and rounding up w
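
The arithmetic quoted in this comment can be checked with a short sketch. It assumes that "rounding up" means taking the ceiling. The function name `default_buffer_size` and its parameter names are hypothetical, chosen to mirror the terms in the comment; this is not MADlib's API and may not match the formula finally merged in the pull request.

```python
import math


def default_buffer_size(avg_num_rows_per_segment, num_segments):
    """Hypothetical helper: the averaging formula as quoted in the comment."""
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    # Assumption: "rounding up" is interpreted as a ceiling.
    return math.ceil(avg_num_rows_per_segment / num_segments)


# Values taken from the comment: 21.5 / 2 = 10.75, which rounds up to 11.
print(default_buffer_size(21.5, 2))
```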

[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-06 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/256 Is this expected behavior? The last group for NJ gets only 1 observation. ``` DROP TABLE IF EXISTS iris_data; CREATE TABLE iris_data( id serial, attributes numeric[],

[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-05 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/256 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/425/

[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-05 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/256 We seem to be computing the batch size using the master, but we should probably consider only the number of segments.

[GitHub] madlib issue #256: Minibatch Preprocessing: change default buffer size formu...

2018-04-05 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/256 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/424/