Re: New Committer: Nikhil Kak

2018-06-27 Thread Jingyi Mei
Congrats Nikhil!

On Wed, Jun 27, 2018 at 1:39 PM, Nandish Jayaram wrote:

> Dear MADlib dev community,
>
> The Project Management Committee (PMC) for Apache MADlib
> has invited Nikhil to become a committer and we are pleased
> to announce that he has accepted.
>
> Nikhil started working on the project in Nov 2017.  Since that
> time, he has contributed significantly in both features and bug
> fixes in the following areas:
>
> - mini-batch preprocessor
> - utilities
> - neural networks/multilayer perceptron
> - correlation
> - HITS graph algorithm
> - support vector machines
> - LDA
> - infrastructure projects
> - documentation
> - testing
>
> Being a committer enables easier contribution to the
> project since there is no need to go via the patch
> submission process. This should enable better productivity.
>
> Welcome Nikhil!
>
> Regards,
> The Apache MADlib PMC
>


[GitHub] madlib issue #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/284
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/528/



---


[GitHub] madlib issue #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/284
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/527/



---


[GitHub] madlib issue #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread njayaram2
Github user njayaram2 commented on the issue:

https://github.com/apache/madlib/pull/284
  
Thank you for the comment, @iyerr3. I will relax the constraint as suggested.


---


[GitHub] madlib issue #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread iyerr3
Github user iyerr3 commented on the issue:

https://github.com/apache/madlib/pull/284
  
This test isn't necessarily the best thing to check, but if we do want to 
keep it with relaxed constraints, we could allow the norm to go higher 
by a small amount iff the total loss is lower. Something along the lines of:
```
norm2(l2.coef) < norm2(noreg.coef) OR
( (norm2(l2.coef)-norm2(noreg.coef))/norm2(noreg.coef) < 0.1 AND 
l2.loss < noreg.loss),
```
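The relaxed condition above can be sketched in Python to make the two branches explicit. This is only an illustration, not the dev-check code from the pull request: the function names, the argument layout, and the guard for a zero unregularized norm are assumptions, and the 0.1 tolerance is taken from the suggestion above.

```python
import math


def norm2(coef):
    """Euclidean (L2) norm of a coefficient vector."""
    return math.sqrt(sum(c * c for c in coef))


def relaxed_norm_check(l2_coef, l2_loss, noreg_coef, noreg_loss, tolerance=0.1):
    """Hypothetical helper mirroring the suggested assert condition.

    Passes when the L2-regularized model has a smaller coefficient norm,
    or when its norm exceeds the unregularized norm by less than
    `tolerance` (relative) while its total loss is lower.
    """
    l2_norm = norm2(l2_coef)
    noreg_norm = norm2(noreg_coef)

    # Strict case: regularization shrank the coefficients.
    if l2_norm < noreg_norm:
        return True

    # Assumption: with a zero baseline norm the relative increase is
    # undefined, so the relaxed branch is treated as failing.
    if noreg_norm == 0:
        return False

    # Relaxed case: small relative increase, compensated by a lower loss.
    relative_increase = (l2_norm - noreg_norm) / noreg_norm
    return relative_increase < tolerance and l2_loss < noreg_loss
```

For example, `relaxed_norm_check([1.05, 0.0], 0.9, [1.0, 0.0], 1.0)` passes through the relaxed branch, while the same coefficients with a higher loss for the regularized model would fail.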


---


[GitHub] madlib issue #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/283
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/526/



---


[GitHub] madlib pull request #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread njayaram2
GitHub user njayaram2 opened a pull request:

https://github.com/apache/madlib/pull/284

SVM: Fix flaky dev-check failure

JIRA: MADLIB-1232

SVM has a dev-check query that is flaky on a large cluster. This commit
relaxes the assert condition for that query.

Closes #284

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/madlib/madlib bugfix/svm-flaky-dev-check

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/madlib/pull/284.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #284


commit 9c556816bcca990e9b168cf556ce0da0cacf935a
Author: Nandish Jayaram 
Date:   2018-06-27T19:40:19Z

SVM: Fix flaky dev-check failure

JIRA: MADLIB-1232

SVM has a dev-check query that is flaky on a large cluster. This commit
relaxes the assert condition for that query.

Closes #284




---


[GitHub] madlib pull request #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/madlib/pull/283


---


[GitHub] madlib issue #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread njayaram2
Github user njayaram2 commented on the issue:

https://github.com/apache/madlib/pull/283
  
Thank you for the comments, @kaknikhil. I moved the Jenkins build 
script change into a separate commit.


---


[GitHub] madlib pull request #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request:

https://github.com/apache/madlib/pull/283#discussion_r198641216
  
--- Diff: src/ports/postgres/modules/crf/test/crf_train_large.sql_in ---
@@ -234,26 +234,40 @@ INSERT INTO train_new_segmenttbl VALUES
 (30, 7, 'years', 13, 31),
 (31, 7, '.', 44, 31);
 
-CREATE TABLE train_new_regex(pattern text,name text); 
+CREATE TABLE train_new_regex(pattern text,name text);
 INSERT INTO train_new_regex VALUES
-('^[A-Z][a-z]+$','InitCapital'), ('^[A-Z]+$','isAllCapital'),
+('^[A-Z][a-z]+$','InitCapital'), ('^[A-Z]+$','isAllCapital'),
 ('^.*[0-9]+.*$','containsDigit'),('^.+[.]$','endsWithDot'),
 ('^.+[,]$','endsWithComma'), ('^.+er$','endsWithER'),
 ('^.+est$','endsWithEst'),   ('^.+ed$','endsWithED'),
 ('^.+s$','endsWithS'),   ('^.+ing$','endsWithIng'),
 ('^.+ly$','endsWithly'), ('^.+-.+$','isDashSeparatedWords'),
 ('^.*@.*$','isEmailId');
-analyze train_new_regex;
+analyze train_new_regex;
 
-SELECT crf_train_fgen('train_new_segmenttbl', 'train_new_regex', 'crf_label', 'train_new_dictionary', 'train_new_featuretbl','train_new_featureset');
+CREATE TABLE crf_label_new (id integer,label character varying);
--- End diff --

The two files `crf_test_small.sql_in` and `crf_train_large.sql_in` have 
different indentation. Can we make them consistent?


---


[GitHub] madlib pull request #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request:

https://github.com/apache/madlib/pull/283#discussion_r198644242
  
--- Diff: src/ports/postgres/modules/crf/test/crf_test_small.sql_in ---
@@ -90,7 +90,7 @@
 (18,'PRP$'),(19,'RB'), (20,'RBR'),  (21,'RBS'), (22,'RP'), (23,'SYM'), (24,'TO'), (25,'UH'), (26,'VB'),
 (27,'VBD'), (28,'VBG'),(29,'VBN'),  (30,'VBP'), (31,'VBZ'),(32,'WDT'), (33,'WP'), (34,'WP$'),(35,'WRB'),
 (36,'$'),   (37,'#'),  (38,''), (39,'``'),  (40,'('),  (41,')'),   (42,','),  (43,'.'),  (44,':');
-   analyze crf_label;
+   analyze test_crf_label;
--- End diff --

Assuming that the table `crf_label` doesn't exist, why wasn't the CRF install 
check always red?


---


New Committer: Nikhil Kak

2018-06-27 Thread Nandish Jayaram
Dear MADlib dev community,

The Project Management Committee (PMC) for Apache MADlib
has invited Nikhil to become a committer and we are pleased
to announce that he has accepted.

Nikhil started working on the project in Nov 2017.  Since that
time, he has contributed significantly in both features and bug
fixes in the following areas:

- mini-batch preprocessor
- utilities
- neural networks/multilayer perceptron
- correlation
- HITS graph algorithm
- support vector machines
- LDA
- infrastructure projects
- documentation
- testing

Being a committer enables easier contribution to the
project since there is no need to go via the patch
submission process. This should enable better productivity.

Welcome Nikhil!

Regards,
The Apache MADlib PMC


[GitHub] madlib issue #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/283
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/525/



---


[GitHub] madlib pull request #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread njayaram2
GitHub user njayaram2 opened a pull request:

https://github.com/apache/madlib/pull/283

Bugfix: Fix failing dev check in CRF

This commit has the following changes:
- A couple of dev check files in CRF did not create the label table,
even though one of their queries consumed it, which led to the
dev-check failure.
- Run dev check on the Jenkins build instead of install check.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/madlib/madlib bugfix/crf-dev-check

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/madlib/pull/283.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #283


commit deb206b7d1c1e7ce87d6e33c7a1dff91b3adb82b
Author: Nandish Jayaram 
Date:   2018-06-27T18:25:46Z

Bugfix: Fix failing dev check in CRF

This commit has the following changes:
- A couple of dev check files in CRF did not create the label table,
even though one of their queries consumed it, which led to the
dev-check failure.
- Run dev check on the Jenkins build instead of install check.




---


[GitHub] madlib pull request #282: Utilities: Add CTAS while dropping some columns

2018-06-27 Thread iyerr3
GitHub user iyerr3 opened a pull request:

https://github.com/apache/madlib/pull/282

Utilities: Add CTAS while dropping some columns

JIRA: MADLIB-1241

This commit adds a function to create a new table from an existing table
while dropping some of the columns of the source table.

Closes #282
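
The idea described above (CREATE TABLE AS, keeping every source column except a chosen few) can be sketched as follows. This is a minimal illustration and not the function added by this pull request: the helper name, its parameters, and the choice to take the column list as an argument rather than reading it from the catalog are all assumptions. Table names are assumed to be already-safe identifiers; column names are double-quoted in the PostgreSQL style.

```python
def build_ctas_dropping_columns(source_table, target_table,
                                source_columns, columns_to_drop):
    """Hypothetical helper: build a CTAS statement that omits some columns.

    source_columns  -- all column names of the source table, in order
    columns_to_drop -- column names to leave out of the new table
    """
    drop = set(columns_to_drop)

    # Refuse to silently ignore a misspelled column name.
    unknown = drop - set(source_columns)
    if unknown:
        raise ValueError("unknown columns: {}".format(sorted(unknown)))

    kept = [c for c in source_columns if c not in drop]
    if not kept:
        raise ValueError("at least one column must remain")

    # Double-quote each column name, escaping embedded double quotes.
    quoted = ", ".join('"{}"'.format(c.replace('"', '""')) for c in kept)
    return "CREATE TABLE {} AS SELECT {} FROM {}".format(
        target_table, quoted, source_table)
```

Calling `build_ctas_dropping_columns("src", "dst", ["id", "a", "b"], ["a"])` yields a statement that selects only `"id"` and `"b"` from `src` into `dst`.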

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/madlib/madlib feature/drop_columns

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/madlib/pull/282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #282


commit 3b5310c07c2092059cd116c82457d026f968cd7a
Author: Rahul Iyer 
Date:   2018-06-27T00:36:39Z

Utilities: Add CTAS while dropping some columns

JIRA: MADLIB-1241

This commit adds a function to create a new table from an existing table
while dropping some of the columns of the source table.

Closes #282




---