Re: [VOTE] MADlib v1.17.0-rc2

2020-04-08 Thread Nandish Jayaram
[+1] binding.

NJ

>
>
> ---------- Forwarded message ---------
> From: Orhan Kislal 
> Date: Mon, Apr 6, 2020 at 4:49 PM
> Subject: [VOTE] MADlib v1.17.0-rc2
> To: , 
>
>
> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features
> - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
> - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
> - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
> - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
> - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
> - DL: Preprocessor should evenly distribute data on an arbitrary number
> of segments (MADLIB-1378)
> - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
> - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
> - DL: Support DL predict without training on MADlib (MADLIB-1359)
> - DL: Transfer learning for multi-model (MADLIB-1389)
> - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
> - Kmeans: Select number of centroids in k-means (MADLIB-1380)
> - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
> - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
> - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
> - DL: helper function for asymmetric cluster config (MADLIB-1390)
> - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
> - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
> - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
> - DL: performance improvements to fit transition function (MADLIB-1418)
> - Docs: Enhance Installation Guides (MADLIB-1399)
> - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
> - Knn - add zero check and output distance array (MADLIB-1370)
> - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
> - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
> - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
> - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
> - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
> - DL: Exception during madlib_keras_fit when model_arch_id is passed as
> NULL (MADLIB-1371)
> - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
> - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
> - DL: Intermediate tables are not dropped (MADLIB-1404)
> - DL: MADlib Keras operations create too many threads (MADLIB-1372)
> - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
> - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
> - DL: Remove final function for fit multiple (MADLIB-1416)
> - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
> - Graph: APSP fails if both vertex id column and edge src column have
> the same name (MADLIB-1407)
> - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
> - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
> - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
> - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
> - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
> - Pivot: Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
> - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
> - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
> - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated.  This process holds the GPU memory until one of the
> 

Re: building madlib

2020-02-12 Thread Nandish Jayaram
Hi Anton,

Thank you for trying out MADlib. I might not know the exact answer to your
question, but folks on the dev mailing list (CCed) should be able to help
you out.

Regards
Nandish

On Fri, Feb 7, 2020 at 11:01 AM Anton Kirillov 
wrote:

> Hello Nandish!
>
> I'm trying to build MADlib 1.16 with Greenplum 6.3. I've reviewed madlib
> repo, greenplum repo and pivotal repo and found no clear instructions.
>
> I see there are Docker containers:
>
>    - https://hub.docker.com/r/madlib/ubuntu18-postgres-gpdb
>    - https://hub.docker.com/r/madlib/centos7-postgres-gpdb
>
> which contain the result of the MADlib build process, but without a Dockerfile.
>
> Could you please help me: where can I find the Dockerfiles for these containers?
> Or maybe you know the author and how to contact them?
>
> Thanks!
>
> Anton
>


Re: [VOTE] MADlib v1.16-rc2

2019-07-03 Thread Nandish Jayaram
[+1] binding

Thank you Domino.
I installed the dmg on a GPDB5 cluster and ran install-check and dev-check
successfully.

NJ

On Wed, Jul 3, 2019 at 10:06 AM Frank McQuillan 
wrote:

> + u...@madlib.apache.org
>
> Thank you for reposting the release artifacts.
>
> tested PostgreSQL 11.4 on x86_64-apple-darwin16.7.0
>
> passed install check, dev check and spot tests on functionality
>
> +1 (binding)
>
> Frank
>
> On Tue, Jul 2, 2019 at 5:01 PM Domino Valdano  wrote:
>
>> [Resending with amended Subject line]
>>
>> Hello Apache MADlib community,
>>
>> This is the vote for Apache MADlib 1.16 Release (RC2). It provides the
>> source release tarball and convenience binaries.
>>
>> Changes from RC1:
>>
>>    - Extraneous files removed from DMG package
>>    - Support for gpdb6 removed from RPM & DEB packages
>>    - gpdb5 RPM package built on Centos 7 instead of Centos 6
>>    - RPM package now compiled with gcc 4.8 instead of gcc 6.2
>>
>> The vote will run for at least 72 hours and will close on Saturday
>> July 6, 2019 @ 0:00 UTC (Friday July 5, 2019 @ 17:00 PDT).
>> A minimum of 3 binding +1 votes and more binding +1 than
>> binding -1 are required to pass.
>>
>> The main goals of this release are:
>>
>> New features:
>> - Deep learning: support for Keras with TensorFlow backend with GPU
>>   acceleration (MADLIB-1268, MADLIB-1304, MADLIB-1305, MADLIB-1307,
>>   MADLIB-1308, MADLIB-1309, MADLIB-1310, MADLIB-1311, MADLIB-1313,
>>   MADLIB-1314, MADLIB-1315, MADLIB-1316, MADLIB-1319, MADLIB-1321,
>>   MADLIB-1326, MADLIB-1330, MADLIB-1335, MADLIB-1336, MADLIB-1338,
>>   MADLIB-1343, MADLIB-1348, MADLIB-1349, MADLIB-1350, MADLIB-1356,
>>   MADLIB-1357, MADLIB-1358, MADLIB-1324, MADLIB-1337, MADLIB-1347,
>>   MADLIB-1363)
>> - Deep learning: utility to load model architectures and weights
>>   (MADLIB-1306)
>> - Deep learning: preprocess images for gradient descent optimization
>>   algorithms (MADLIB-1290, MADLIB-1332, MADLIB-1334, MADLIB-1300,
>>   MADLIB-1303)
>> - kd-tree method for k-nearest neighbors for faster approximate
>>   solution (MADLIB-1061, MADLIB-1293)
>> - Support for Greenplum 6 (MADLIB-1298)
>> - Support for PostgreSQL 11 (MADLIB-1283)
>>
>> Bug fixes:
>> - Jaccard distance not releasing memory (MADLIB-1291)
>> - MLP with minibatching fails on postgres (MADLIB-1302)
>> - MLP does not stop even after tolerance reached (MADLIB-1325)
>> - MLP warm start not working (MADLIB-1329)
>> - MLP with minibatch fails for integer dependent variable on PostgreSQL
>>   (MADLIB-1322)
>> - MLP fix column name in output table (MADLIB-1323)
>> - Pivot: Fix array_agg + distinct scaling issue on gpdb (MADLIB-1361)
>> - linregr_train fails when dependent variable is a JSONB element
>>   (MADLIB-1284)
>> - MADLib 1.15 does not recognize Postgres 10 declarative partitioned
>>   table (MADLIB-1287)
>> - Encoding module is not handling bigint properly (MADLIB-1295)
>> - SVM class_weight param not working properly (MADLIB-1346)
>>
>> Other:
>> - Simplify maintenance via removing online examples from sql functions
>>   (MADLIB-1260)
>> - Improve performance for weakly connected components (MADLIB-1320)
>> - SVD minor messaging improvements (MADLIB-983)
>> - Create SQL scripts to get lists of changed UDOs and UDOCs
>>   (MADLIB-1281)
>> - Set max itemset size to 10 by default in assoc rules (MADLIB-1288)
>> - Misc messages for 1.16 release (MADLIB-1364)
>> - Madlib 1.16 release tasks (MADLIB-1362)
>>
>> 1.16 docs available here:
>> http://madlib.apache.org/docs/rc/index.html
>>
>> For additional information, please see:
>> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.16
>>
>> Here are the release artifact details:
>>
>> Source release tag to be voted on: rc/1.16-rc2, located here:
>>
>> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.16-rc2
>>
>> Source release tarball can be retrieved from the following locations:
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-src.tar.gz
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-src.tar.gz.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-src.tar.gz.sha512
>>
>> Convenience binary packages can be retrieved from the following
>> locations:
>>
>> macOS: 10.12 GPDB 5.*, PostgreSQL 10 & 11
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-bin-Darwin.dmg
>> PGP Signature:
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-bin-Darwin.dmg.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.16.RC2/apache-madlib-1.16-bin-Darwin.dmg.sha512
>>
>> 
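The announcement above publishes a SHA512 hash next to each artifact. A minimal Python sketch of checking a downloaded file against such a hash follows. The function names are hypothetical, and the sketch assumes the caller has already extracted the bare hex digest from the `.sha512` file, since the exact layout of that file is not shown in this thread. It does not replace verifying the PGP signature.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's digest with a published hex digest.

    Assumption: `published_hex` holds only the hex digest, possibly with
    surrounding or embedded whitespace and in either letter case.
    """
    expected = "".join(published_hex.split()).lower()
    return sha512_hex(path) == expected
```

For example, `matches_published_digest("apache-madlib-1.16-src.tar.gz", digest_text)` would be called with the local tarball path and the digest copied from the corresponding `.sha512` file.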

Re: [VOTE] MADlib v1.16-rc1

2019-06-28 Thread Nandish Jayaram
Hi,

My vote is a [-1].

I tried installing the dmg for GPDB 5, and had a couple of dev-check
failures. Please find the details below:
1) I actually don't see utilities/minibatch_preprocessing_dl in the master
branch.
```
TEST CASE RESULT|Module:
utilities|minibatch_preprocessing_dl.sql_in|FAIL|Time: 334 milliseconds
madpack.py: ERROR : Failed executing
/tmp/madlib.Wdj2IU/utilities/minibatch_preprocessing_dl.sql_in.tmp
madpack.py: ERROR : Check the log at
/tmp/madlib.Wdj2IU/utilities/minibatch_preprocessing_dl.sql_in.log
```
The error message in the log is:
```
SELECT minibatch_preprocessor_dl(
  'minibatch_preprocessor_dl_input',
  'minibatch_preprocessor_dl_batch',
  'id',
  'x',
  5);
psql:/tmp/madlib.Wdj2IU/utilities/minibatch_preprocessing_dl.sql_in.tmp:55:
ERROR:  AttributeError: 'module' object has no attribute
'MiniBatchPreProcessorDL' (plpython.c:5038)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "minibatch_preprocessor_dl", line 23, in 
minibatch_preprocessor_obj =
minibatch_preprocessing.MiniBatchPreProcessorDL(**globals())
PL/Python function "minibatch_preprocessor_dl"
```

2) Deep learning has a debug.sql_in, which might be there by mistake.
```
TEST CASE RESULT|Module: deep_learning|debug.sql_in|FAIL|Time: 33
milliseconds
madpack.py: ERROR : Failed executing
/tmp/madlib.Wdj2IU/deep_learning/debug.sql_in.tmp
madpack.py: ERROR : Check the log at
/tmp/madlib.Wdj2IU/deep_learning/debug.sql_in.log
```
The error message in the log is:
```
psql:/tmp/madlib.Wdj2IU/deep_learning/debug.sql_in.tmp:7: ERROR:  function
dummy() does not exist
LINE 1: select dummy();
               ^
HINT:  No function matches the given name and argument types. You might
need to add explicit type casts.
```

NJ

On Thu, Jun 27, 2019 at 4:00 PM Domino Valdano  wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.16 Release (RC1). It provides the
> source release tarball and convenience binaries.
>
> The vote will run for at least 72 hours and will close on Tuesday,
> July 2, 2019 @ 23:00 UTC (16:00 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features:
> - Deep learning: support for Keras with TensorFlow backend with GPU
>   acceleration (MADLIB-1268, MADLIB-1304, MADLIB-1305, MADLIB-1307,
>   MADLIB-1308, MADLIB-1309, MADLIB-1310, MADLIB-1311, MADLIB-1313,
>   MADLIB-1314, MADLIB-1315, MADLIB-1316, MADLIB-1319, MADLIB-1321,
>   MADLIB-1326, MADLIB-1330, MADLIB-1335, MADLIB-1336, MADLIB-1338,
>   MADLIB-1343, MADLIB-1348, MADLIB-1349, MADLIB-1350, MADLIB-1356,
>   MADLIB-1357, MADLIB-1358, MADLIB-1324, MADLIB-1337, MADLIB-1347,
>   MADLIB-1363)
> - Deep learning: utility to load model architectures and weights
>   (MADLIB-1306)
> - Deep learning: preprocess images for gradient descent optimization
>   algorithms (MADLIB-1290, MADLIB-1332, MADLIB-1334, MADLIB-1300,
>   MADLIB-1303)
> - kd-tree method for k-nearest neighbors for faster approximate
>   solution (MADLIB-1061, MADLIB-1293)
> - Support for Greenplum 6 (MADLIB-1298)
> - Support for PostgreSQL 11 (MADLIB-1283)
>
> Bug fixes:
> - Jaccard distance not releasing memory (MADLIB-1291)
> - MLP with minibatching fails on postgres (MADLIB-1302)
> - MLP does not stop even after tolerance reached (MADLIB-1325)
> - MLP warm start not working (MADLIB-1329)
> - MLP with minibatch fails for integer dependent variable on PostgreSQL
>   (MADLIB-1322)
> - MLP fix column name in output table (MADLIB-1323)
> - Pivot: Fix array_agg + distinct scaling issue on gpdb (MADLIB-1361)
> - linregr_train fails when dependent variable is a JSONB element
>   (MADLIB-1284)
> - MADLib 1.15 does not recognize Postgres 10 declarative partitioned
>   table (MADLIB-1287)
> - Encoding module is not handling bigint properly (MADLIB-1295)
> - SVM class_weight param not working properly (MADLIB-1346)
>
> Other:
> - Simplify maintenance via removing online examples from sql functions
>   (MADLIB-1260)
> - Improve performance for weakly connected components (MADLIB-1320)
> - SVD minor messaging improvements (MADLIB-983)
> - Create SQL scripts to get lists of changed UDOs and UDOCs
>   (MADLIB-1281)
> - Set max itemset size to 10 by default in assoc rules (MADLIB-1288)
> - Misc messages for 1.16 release (MADLIB-1364)
> - Madlib 1.16 release tasks (MADLIB-1362)
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.16
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.16-rc1, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.16-rc1
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> 
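The passing rule quoted in the announcement above (at least 3 binding +1 votes, and more binding +1 than binding -1) can be sketched in Python as follows. The `Vote` type and the function name are hypothetical and exist only for illustration. Non-binding votes are ignored by the rule as quoted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical identifier for the person voting
    value: int     # +1, 0, or -1
    binding: bool  # True when the vote is binding


def release_vote_passes(votes, min_binding_plus=3):
    """Apply the rule quoted in the vote email to a list of votes."""
    plus = sum(1 for v in votes if v.binding and v.value > 0)
    minus = sum(1 for v in votes if v.binding and v.value < 0)
    return plus >= min_binding_plus and plus > minus
```

Under this rule a single binding -1 does not by itself fail the vote; it only has to be outnumbered by binding +1 votes once the minimum of three is met.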

Re: [VOTE] MADlib v1.15.1-rc1

2018-10-13 Thread Nandish Jayaram
+1 Binding
Built from source on Ubuntu. Was able to create .deb binary too using `make
package`. Installing MADlib using the generated .deb binary was successful
(along with install-check).

NJ

On Sat, Oct 13, 2018 at 11:17 AM Nikhil Kak  wrote:

> +1  Tested using the .deb file on gpdb 5.11 Ubuntu 18.04.1. dev-check and
> install-check ran successfully.
>
> On Fri, Oct 12, 2018 at 5:44 PM Domino Valdano 
> wrote:
>
>> +1
>>
>> On Fri, Oct 12, 2018 at 9:04 AM FENG, Xixuan (Aaron) <
>> xixuan.f...@gmail.com> wrote:
>>
>>> +1 (binding)
>>> Dev-checked in Ubuntu Postgres 10
>>>
>>> On Thu, Oct 11, 2018 at 8:53 Marshall Presser wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> On Wed, Oct 10, 2018 at 6:48 AM Orhan Kislal
>>>> wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.15.1 Release (RC1). It provides
> the
> source release tarball and convenience binaries.
>
> The vote will run for at least 72 hours and will close on Saturday,
> October 13th, 2018 @ 14:00 GMT+3 (04:00 PDT). A minimum of 3 binding +1
> votes and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features:
>
> - Add ubuntu support for MADlib (MADLIB-1256).
> - Elastic Net: Add grouping by non-numeric column support
> (MADLIB-1262).
> - KNN: Accept expressions for point_column_name and
> test_column_name (MADLIB-1060).
> - Vec2Cols: Allow arrays of different lengths (MADLIB-1270).
> - Madpack: Add a script for automating changelist creation.
>
> Bug fixes:
>
> - Allocator: Remove 16-byte alignment in GPDB 6.
> - Build: Download compatible Boost if version >= 1.65
> (MADLIB-1235).
> - Build: Remove primary key constraint in IC/DC.
> - CMake: Fix false positive for Postgres 10+ check.
> - Graph: Add id of nodes with 0 in-degree (MADLIB-1279).
> - Margins: Copy summary table instead of renaming (MADLIB-1276).
> - MLP: Simplify momentum and Nesterov updates (MADLIB-1272).
> - Upgrade: Fix issue with upgrading RPM to 1.15.1 (MADLIB-1278).
> - Utilities: Use plpy.quote_ident if available.
>
> Other:
>
> - Simplify maintenance via removing online examples from sql
> functions (MADLIB-1260).
> - Re-enable PCA and PageRank tests (MADLIB-1264).
> - Build: Disable AppendOnly if available (MADLIB-1273).
> - Improve documentation of various modules.
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.15.1
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.15.1-rc1, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.15.1-rc1
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.* PostgreSQL 9.6 & 10.2
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Darwin.dmg.sha512
>
> CentOS* GPDB 4.3.5+
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux.rpm.sha512
>
> Ubuntu 16.04 GPDB 5.11, PostgreSQL 9.6 & 10.5
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.15.1.RC1/apache-madlib-1.15.1-bin-Linux.deb
> PGP Signature:
>
> 

New Committer: Nikhil Kak

2018-06-27 Thread Nandish Jayaram
Dear MADlib dev community,

The Project Management Committee (PMC) for Apache MADlib
has invited Nikhil to become a committer and we are pleased
to announce that he has accepted.

Nikhil started working on the project in Nov 2017.  Since that
time, he has contributed significantly in both features and bug
fixes in the following areas:

- mini-batch preprocessor
- utilities
- neural networks/multilayer perceptron
- correlation
- HITS graph algorithm
- support vector machines
- LDA
- infrastructure projects
- documentation
- testing

Being a committer enables easier contribution to the
project since there is no need to go via the patch
submission process. This should enable better productivity.

Welcome Nikhil!

Regards,
The Apache MADlib PMC


Re: [VOTE] MADlib v1.14-rc1

2018-04-27 Thread Nandish Jayaram
Hmm, the .asc links issue seems to be local to my environment then.

NJ

On Fri, Apr 27, 2018 at 4:45 PM, Nikhil Kak <n...@pivotal.io> wrote:

> I was able to successfully upgrade from madlib 1.12 to madlib 1.14 on gpdb
> 5.7 centos 6.
>
> I also ran install check after the upgrade and everything passed.
> My vote: +1
>
> > The link to all the `.asc` files in the email you sent is broken. I think
> > the word 'SHA512' from the next line is getting included in the URL from
> > the previous line for some reason.
>
> All the links worked fine for me except the one mentioned by Rashmi.
>
> Thanks,
> Nikhil Kak
>
>
> On Fri, Apr 27, 2018 at 4:23 PM Nandish Jayaram <njaya...@pivotal.io>
> wrote:
>
> > Thank you for being the release manager Jingyi. I used the rpm to upgrade
> > from MADlib 1.13 to 1.14 on Greenplum 5. Works great!
> > +1 (binding)
> >
> > One issue, that does not affect my vote:
> > - The link to all the `.asc` files in the email you sent is broken. I
> > think the word 'SHA512' from the next line is getting included in the URL
> > from the previous line for some reason.
> >
> > NJ
> >
> > On Fri, Apr 27, 2018 at 3:19 PM, Rashmi Raghu <rra...@pivotal.io> wrote:
> >
> >> Installed on Postgres 9.6 on MacOS using dmg.
> >> Checked out the new additions to the summary function. Looks good. My
> >> vote: +1 (binding).
> >>
> >> Some comments aside from the vote:
> >>
> >>    - I followed this link in the email:
> >>      https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
> >>      and then from there clicked on
> >>      https://dist.apache.org/repos/dist/release/madlib/1.14/
> >>      which gives a page-not-found error.
> >>    - I didn't see a link to documentation associated with this release;
> >>      it would be useful to have that also available (let me know if it
> >>      was in the email and I missed it or if it is not standard practice).
> >>      For instance, I wanted to briefly look at the new balanced datasets
> >>      module and it would have been easy to look it up in the web version
> >>      of the docs. I did find the docs through the function call, e.g.
> >>      madlib.balance_sample('usage'), but that requires knowing roughly
> >>      what function name to look for (not hard in this case but I can
> >>      imagine other situations where it might not be straightforward).
> >>
> >> Great to see all the new features and bug fixes!
> >>
> >> Thanks,
> >> Rashmi
> >>
> >>
> >> On Fri, Apr 27, 2018 at 1:40 PM, Orhan Kislal <okis...@pivotal.io>
> >> wrote:
> >>
> >>> Tested on PG 10.3 (src and dmg). Looks good. +1 (binding)
> >>>
> >>> Thanks for preparing the release Jingyi,
> >>>
> >>> Orhan Kislal
> >>>
> >>> On Fri, Apr 27, 2018 at 11:44 AM, Frank McQuillan <fmcquil...@pivotal.io>
> >>> wrote:
> >>>
> >>>> Hi Jingyi,
> >>>>
> >>>> Thanks for posting the artifacts and sending out the vote.
> >>>>
> >>>> My findings:
> >>>>
> >>>> Installation and IC passed on postgres 9.6.7
> >>>>
> >>>> Also I tested a cpl of the new features (personalized page rank and
> >>>> mini-batch preprocessor)
> >>>> and they worked OK for me with a small sample data set.
> >>>>
> >>>> +1 (binding)
> >>>>
> >>>> On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <j...@pivotal.io> wrote:
> >>>>
> >>>> > Hello Apache MADlib dev community,
> >>>> >
> >>>> > This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
> >>>> > source release tarball and convenience binaries. This is the third
> >>>> > Apache MADlib release as an Apache Top Level Project (TLP).
> >>>> >
> >>>> > The vote will run for at least 72 working hours and will close on
> >>>> > Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
> >>>> > more binding +1 than binding -1 are required to pass.
> >>>> >
> >>>> > The main goals of th

Re: Can anyone help me tag madlib 1.14 rc1?

2018-04-25 Thread Nandish Jayaram
Hey Jingyi,

Thank you for being the release manager. I am happy to help. Feel free to
ping me.

NJ

On Wed, Apr 25, 2018 at 5:27 PM, Jingyi Mei  wrote:

> Hi there,
>
> I am in the process of releasing MADlib 1.14 and I need to tag the release
> candidate on the MADlib git repo with my own code signing key ID. Can any
> committer help me with that?
>
> Thank you very much!
>
> Jingyi Mei
>


PyXB version in MADlib github page

2017-12-07 Thread Nandish Jayaram
Hi All,

The README page in MADlib's github page lists PyXB-1.2.4
in Third Party Components. Should it be changed to PyXB-1.2.6 now?

NJ


Re: Apache MADlib v1.13 release manager volunteer

2017-11-16 Thread Nandish Jayaram
+1.
Thanks Ed, it was great working with you a bit last time, would be happy to do
it again!

NJ

On Thu, Nov 16, 2017 at 9:35 AM, Orhan Kislal  wrote:

> Thanks for volunteering Ed. You have my +1. I would be happy to help
> whenever you need committer-level access.
>
> Best wishes,
>
> Orhan Kislal
>
> On Thu, Nov 16, 2017 at 8:54 AM, Ed Espino  wrote:
>
>> MADlibers,
>>
>> If one hasn't been selected thus far, I would like to volunteer for the
>> release manager role for the upcoming Apache MADlib v1.13 release. In
>> addition, I will be focusing on improving the user experience around the
>> "madpack" utility.
>>
>> If allowed to serve in this role, I will need a small bit of GitHub
>> assistance from MADlib committers. I will reach out at the
>> appropriate time.
>>
>> Please let me know your thoughts on my release manager volunteer request.
>>
>> Regards,
>> -=e
>>
>> --
>> *Ed Espino*
>>
>
>


Re: Performance of array_dot vs cosine_similarity, continued

2017-10-23 Thread Nandish Jayaram
Thank you for the further analysis, James. This certainly looks like something
to change in MADlib 2.0. Since 2.0 is a backward-compatibility-breaking release,
it's just easier to keep the same function name and change only its
implementation. But more importantly, we will have ample time to do some serious
performance testing, since array_dot is used internally by other MADlib modules
such as SVM and SVD, to name a few.

Given the performance impact this particular change could have, I vote to make
the change. Since it would also require a lot of performance testing, I would
vote to include this in the 2.0 release.

NJ

On Fri, Oct 20, 2017 at 1:59 PM, James Gregory <james@gmail.com> wrote:

> On 20 October 2017 at 18:43, Nandish Jayaram <njaya...@pivotal.io> wrote:
> > Thank you for following up on that JIRA James. Based on some more code
> > exploration, it looks like we should be able to replace the native
> > implementation of array_dot() with Eigen's dot() function. array_dot()
> > currently takes in `anyarray` as you pointed out, and cosine_similarity()
> > takes in double precision arrays.
> >
> > - But I was able to run cosine_similarity() on int[], float8[] and double
> >   precision[] vector pairs without any issues.
> > - I also checked that the current array_dot() returns a float8, and not
> >   the type of the input arrays, while cosine_similarity() returns a double.
> > - Internally in MADlib, a few modules (GLM, SVM, SVD, matrix_ops, and
> >   conjugate gradient) use the array_dot() function, and they too should not
> >   be affected by this change.
> >
> > So it looks like there might not be any backward compatibility breaking
> > changes if we replace the native array_dot() with Eigen's dot().
> >
>
> Testing locally, eigen array_dot is much faster for doubles, but
> normal array_dot is a bit faster for float4. I don't have enough
> knowledge of the internals of either postgres or madlib to say exactly
> why this is. Maybe postgres is casting float4[] to float8[] when
> calling postgres functions defined as taking doubles, or maybe
> postgres itself doesn't cast but rather the internals of array_ops.c
> are written in such a way as to be faster for float4 than the
> internals of Eigen.
>
> But it seems that even if switching out the implementation totally
> isn't actually a breaking change, it would cause a slight performance
> degradation for people not using double precision.
>
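For readers following the array_dot() versus cosine_similarity() comparison in this thread, the mathematical relationship between the two can be sketched in Python. These are plain-Python stand-ins written for illustration; they are not MADlib's native or Eigen-based implementations, and they say nothing about the relative performance discussed above. The sketch mirrors one point from the thread: the dot product returns a float even when the inputs are integer arrays.

```python
import math


def array_dot(x, y):
    """Dot product of two equal-length numeric sequences, returned as a float."""
    if len(x) != len(y):
        raise ValueError("arrays must have the same length")
    return float(sum(a * b for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine similarity expressed through three dot products."""
    norm_product = math.sqrt(array_dot(x, x)) * math.sqrt(array_dot(y, y))
    if norm_product == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return array_dot(x, y) / norm_product
```

Written this way, cosine similarity costs roughly three dot products plus two square roots, which is why a faster dot-product kernel would matter to both functions.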


Re: Jenkins infra is not stable for MADlib PR

2017-09-19 Thread Nandish Jayaram
There is another Jenkins related failure:
https://builds.apache.org/user/riyer/my-views/view/MADlib-Monitor/job/madlib-master-build/98/console

[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]   The project (/home/jenkins/jenkins-slave/workspace/madlib-master-build/incubator-madlib/pom.xml) has 1 error
[ERROR]     Non-readable POM /home/jenkins/jenkins-slave/workspace/madlib-master-build/incubator-madlib/pom.xml: /home/jenkins/jenkins-slave/workspace/madlib-master-build/incubator-madlib/pom.xml (No such file or directory)
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException



This seems to be different from what Jingyi pointed out. Any workaround for
this, or should we raise a ticket instead?

NJ

On Tue, Sep 19, 2017 at 2:15 PM, Ed Espino  wrote:

> Thanks for the info Roman.
>
> For the "madlib-pr-build" and "madlib-master-build" Apache Jenkins jobs and
> to exclude the problematic agent "qnode3", I have changed the "Label
> Expression" value for the "Restrict where this project can be run" option
> from "ubuntu" to "ubuntu && ! qnode3".
>
> Cheers,
> -=e
>
>
> On Tue, Sep 19, 2017 at 1:58 PM, Roman Shaposhnik 
> wrote:
>
> > A quick solution is to disable the offending host in your Jenkins
> > configuration.
> >
> > Thanks,
> > Roman.
> >
> > On Tue, Sep 19, 2017 at 12:15 PM, Jingyi Mei  wrote:
> > > Hi Ed,
> > >
> > > Thanks for referring to the INFRA issue. Seems there is no quick solution
> > > for that yet. And since I kept getting the same error for my PR, I am going
> > > to create another issue specifically for unblocking my build.
> > >
> > > Thank you very much!
> > > Jingyi
> > >
> > > On Mon, Sep 18, 2017 at 8:23 PM, Ed Espino  wrote:
> > >
> > >> Hey Jingyi,
> > >>
> > >> Welcome to the project. These types of issues do happen from time to time.
> > >> There is a recent Infrastructure issue which references the same system
> > >> that your MADlib PR build hit.
> > >>
> > >> Jira: INFRA-14979
> > >> Summary: Jenkins jobs on qnode3 are failing with "No space left on device"
> > >> Jira link: https://issues.apache.org/jira/browse/INFRA-14979
> > >>
> > >> If the problem persists, please file an Infrastructure (INFRA) ticket so it
> > >> can be addressed. There is a good chance it impacts other projects as well.
> > >>
> > >> Thanks,
> > >> -=e
> > >>
> > >>
> > >> On Mon, Sep 18, 2017 at 2:25 PM, Jingyi Mei  wrote:
> > >>
> > >> > Hi developers,
> > >> >
> > >> > Recently, I have kept getting build failures on Jenkins for infra reasons.
> > >> > The error message is as follows:
> > >> >
> > >> > ---
> > >> >
> > >> > FATAL: Unable to produce a script file
> > >> > java.io.IOException: No space left on device
> > >> > at java.io.FileOutputStream.writeBytes(Native Method)
> > >> > at java.io.FileOutputStream.write(FileOutputStream.java:326)
> > >> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> > >> > at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:316)
> > >> > at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149)
> > >> > at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233)
> > >> > at hudson.FilePath$17.invoke(FilePath.java:1380)
> > >> > at hudson.FilePath$17.invoke(FilePath.java:1363)
> > >> > at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2739)
> > >> > at hudson.remoting.UserRequest.perform(UserRequest.java:153)
> > >> > at hudson.remoting.UserRequest.perform(UserRequest.java:50)
> > >> > at hudson.remoting.Request$2.run(Request.java:336)
> > >> > at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
> > >> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > >> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >> > at java.lang.Thread.run(Thread.java:745)
> > >> > at ..remote call to qnode3(Native Method)
> > >> > at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1545)
> > >> > at hudson.remoting.UserResponse.retrieve(UserRequest.java:253)
> > >> > at hudson.remoting.Channel.call(Channel.java:830)
> > >> > at hudson.FilePath.act(FilePath.java:986)
> > >> > Caused: java.io.IOException: remote file 

Re: Apache Jenkins madlib-pr-build project updated.

2017-09-05 Thread Nandish Jayaram
Thanks Ed!

NJ

On Tue, Sep 5, 2017 at 9:41 PM, Ed Espino  wrote:

> I updated the Apache Jenkins madlib-pr-build build job (updated the clone
> working directory madlib --> madlib-pr). This appears to have fixed the
> permission denied issue we have been plagued by recently. Please keep an
> eye on your PR build checks.
>
> https://builds.apache.org/job/madlib-pr-build/
>
> Regards,
> -=e
>
> --
> Ed Espino
>


MADlib git repos move

2017-08-31 Thread Nandish Jayaram
Hi All,

The Apache infra team has very kindly removed 'incubator'
from our git repos and other Apache resources:
https://issues.apache.org/jira/browse/INFRA-14872

As developers, the consequence of this is that we will have to
update our forked repo information accordingly. Please go ahead
and update your own forked repo name. For instance, I have updated
my repo name to https://github.com/njayaram2/madlib. To make this
change, please follow the instructions at:
https://help.github.com/articles/renaming-a-repository/

Further, you have to update the remote URLs in your local repositories
too, both for your own forks and for MADlib's upstream repos. Running
`git remote -v` shows all the remotes you have locally, and their URLs.
For instance, I used to have:
> git remote -v
njayaram2 https://github.com/njayaram2/incubator-madlib.git (fetch)
njayaram2 https://github.com/njayaram2/incubator-madlib.git (push)
upstream https://njaya...@git-wip-us.apache.org/repos/asf/incubator-madlib.git (push)
upstream https://njaya...@git-wip-us.apache.org/repos/asf/incubator-madlib.git (fetch)
origin https://github.com/apache/incubator-madlib.git (fetch)
origin https://github.com/apache/incubator-madlib.git (push)

Run the following commands to update this information:
> git remote set-url njayaram2 https://github.com/njayaram2/madlib.git
> git remote set-url upstream https://njaya...@git-wip-us.apache.org/repos/asf/madlib.git
> git remote set-url origin https://github.com/apache/madlib.git
> git remote -v
njayaram2 https://github.com/njayaram2/madlib.git (fetch)
njayaram2 https://github.com/njayaram2/madlib.git (push)
upstream https://njaya...@git-wip-us.apache.org/repos/asf/madlib.git (push)
upstream https://njaya...@git-wip-us.apache.org/repos/asf/madlib.git (fetch)
origin https://github.com/apache/madlib.git (fetch)
origin https://github.com/apache/madlib.git (push)
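
If you have several clones, the same renaming can be scripted. The following
is only a sketch, assuming git 2.7 or later (for `git remote get-url`) and
that every remote URL to fix contains the string `incubator-madlib`. The
function names `strip_incubator` and `rename_madlib_remotes` are made up for
illustration and are not part of any MADlib tooling.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Print a URL with the first "incubator-madlib" replaced by "madlib".
# This also turns "incubator-madlib-site" into "madlib-site".
strip_incubator() {
    printf '%s\n' "$1" | sed 's/incubator-madlib/madlib/'
}

# Rewrite every remote of the repository in the current directory whose
# URL still mentions the old name, and report each change.
rename_madlib_remotes() {
    for remote in $(git remote); do
        old_url=$(git remote get-url "$remote")
        new_url=$(strip_incubator "$old_url")
        if [ "$old_url" != "$new_url" ]; then
            git remote set-url "$remote" "$new_url"
            echo "$remote: $old_url -> $new_url"
        fi
    done
}
```

After sourcing this file, run `rename_madlib_remotes` from inside each local
clone, then check the result with `git remote -v`. Remotes that do not
mention the old name are left untouched.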

Similar changes must be made to the other MADlib git repos, if you have
them:
1) https://git1-us-west.apache.org/repos/asf?p=incubator-madlib-site.git
2) https://github.com/apache/incubator-madlib-site (Mirror)
These are now changed to:
1) https://git1-us-west.apache.org/repos/asf?p=madlib-site.git
2) https://github.com/apache/madlib-site

These seem to be all the changes we will have to make as developers.
Please add to this list if you find any other changes we should be
making.

NJ


Unable to add release info for 'madlib' in Apache reporter database

2017-08-29 Thread Nandish Jayaram
Hi All,

Now that the 1.12 release vote is done and we are
preparing to make the release, I received an email
saying I must update the release date and version
in the Apache reporter database at:
https://reporter.apache.org/addrelease.html?madlib

But, I am unable to update the latest version information
there even though I am a MADlib PMC member. Any ideas
how to do it?

NJ


Re: [VOTE] Apache MADlib 1.12 Release (RC1)

2017-08-28 Thread Nandish Jayaram
Upgraded from MADlib 1.9.1 to 1.12 on GPDB 4.3.12, and also made a fresh
MADlib installation on the same platform successfully.

+1 Binding

NJ

On Fri, Aug 25, 2017 at 8:21 PM, Rashmi Raghu  wrote:

> Installed and tested install-check on GPDB 4.3.12 using the RPM. Also, ran
> a couple of new algorithms (connected components & bfs) on a small dataset.
>
> +1 Binding
>
> Thanks,
> Rashmi
>
>
> On Fri, Aug 25, 2017 at 6:12 PM, Srivatsan Ramanujam wrote:
>
> > Installed and tested install-check on *Postgres 9.6* on *OSX 10.11.6* (El
> > Capitan) using the dmg file.
> >
> > +1 Binding
> >
> > Thanks
> > Vatsan
> >
> > On Thu, Aug 24, 2017 at 4:58 PM, Orhan Kislal wrote:
> >
> > > Installed and tested install-check on GPDB 4.3.11 using the RPM.
> > >
> > > +1 Binding
> > >
> > > Thanks,
> > >
> > > Orhan
> > >
> > > On Thu, Aug 24, 2017 at 3:41 PM, Ed Espino  wrote:
> > >
> > > > +1 (non-binding as I'm not a PMC member)
> > > >
> > > > PGP signatures validated
> > > > Hashes validated
> > > > Apache Release Audit Tool (RAT) validated
> > > >
> > > > > On macOS 10.12.6 (16G29), built from source tarball
> > > > (apache-madlib-1.12-src.tar.gz) and validated with "install-check"
> > > > using PostgreSQL 9.6.4 & 9.5.8
> > > >
> > > > On macOS 10.12.6 (16G29), validated convenience binary
> > > > (apache-madlib-1.12-bin-Darwin.dmg) using PostgreSQL 9.6.4 & 9.5.8
> > > >
> > > > > On Redhat 7.2, validated convenience binary
> > > > > (apache-madlib-1.12-bin-Linux-GPDB5beta8.rpm) using Greenplum Database
> > > > > 5 beta 8.
> > > > >
> > > > > On Redhat 7.2, validated convenience binary
> > > > > (apache-madlib-1.12-bin-Linux.rpm) using Greenplum Database 4.3.16.1.
> > > >
> > > > Regards,
> > > > -=e
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > On Thu, Aug 24, 2017 at 2:28 PM, Ed Espino wrote:
> > > >
> > > > > > I just updated the correct Apache HAWQ 1.12.RC1 convenience binary
> > > > > > apache-madlib-1.12-bin-Linux.rpm. All other files are in good standing.
> > > > > > Please reverify this binary package, pgp signature and hashes. Sorry for
> > > > > > the confusion.  Thanks for pointing this out Orhan.
> > > > >
> > > > > Regards,
> > > > > -=e
> > > > >
> > > > > > On Thu, Aug 24, 2017 at 1:36 PM, Orhan Kislal wrote:
> > > > >
> > > > > >> Hi, I was testing the RPM on GPDB 4.3.11 and it seems the folder it creates
> > > > > >> is named 1.12-dev. I don't see it in the code, so the cause might be reusing
> > > > > >> an old build folder after the version name was changed from 1.12-dev to
> > > > > >> 1.12. In addition, it does not have support for the platform. I see the
> > > > > >> GPDB 5 folder under ports so the names might have been mixed.
> > > > >>
> > > > >> Orhan Kislal
> > > > >>
> > > > > >> On Thu, Aug 24, 2017 at 12:04 PM, Ed Espino wrote:
> > > > >>
> > > > > >> > FYI: You may find this useful when validating the release artifacts. I have
> > > > > >> > used a variation of the following script to retrieve release artifacts,
> > > > > >> > verify PGP signatures and check hashes (md5 and sha512). I run this on my
> > > > > >> > macbook pro with the utilities available via homebrew. I have pasted the
> > > > > >> > script below as well as the output from running it.
> > > > >> >
> > > > >> > Regards,
> > > > >> > -=e
> > > > >> >
> > > > > >> > #! /bin/sh
> > > > > >> > ## --
> > > > > >> > ## Retrieve release artifacts, verify pgp signatures and check hashes.
> > > > > >> > ##
> > > > > >> > ## Most utilities are available via homebrew packages:
> > > > > >> > ##   gnupg
> > > > > >> > ##   gpg2
> > > > > >> > ##   coreutils
> > > > > >> > ## --
> > > > > >> >
> > > > > >> > files="
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Darwin.dmg
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Darwin.dmg.asc
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Darwin.dmg.md5
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Darwin.dmg.sha512
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Linux-GPDB5beta8.rpm
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Linux-GPDB5beta8.rpm.asc
> > > > > >> > https://dist.apache.org/repos/dist/dev/madlib/1.12.RC1/apache-madlib-1.12-bin-Linux-GPDB5beta8.rpm.md5
> > > > >> >
> > 
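
The hash-checking step that the message above describes can be sketched as
follows. This is not Ed's script, which is cut off in the archive; it is a
minimal illustration that assumes either `sha512sum` (coreutils) or
`shasum` is installed. The function names `sha512_of`, `normalize_digest`
and `check_sha512` are hypothetical. The expected digest is passed as a
string, because the layout of a published `.sha512` file can vary; spaces,
line breaks and upper-case hex are tolerated.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# Print the lowercase SHA-512 hex digest of a file, using whichever
# of sha512sum or shasum is available.
sha512_of() {
    if command -v sha512sum >/dev/null 2>&1; then
        sha512sum "$1" | awk '{print $1}'
    else
        shasum -a 512 "$1" | awk '{print $1}'
    fi
}

# Normalize a digest string: drop whitespace and lowercase the hex digits.
normalize_digest() {
    printf '%s' "$1" | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# Compare a file's digest with the expected one; return non-zero on mismatch.
check_sha512() {
    actual=$(sha512_of "$1")
    expected=$(normalize_digest "$2")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1" >&2
        return 1
    fi
}
```

A call would look like `check_sha512 apache-madlib-1.12-src.tar.gz "<digest
copied from the .sha512 file>"`. Signature checking is a separate step,
typically `gpg --verify file.asc file` after importing the project's KEYS.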

Re: ASF press release on MADlib TLP graduation

2017-08-22 Thread Nandish Jayaram
Awesome! :)

NJ

On Tue, Aug 22, 2017 at 10:33 AM, Frank McQuillan 
wrote:

> MADlib community,
>
> Received this note below from Sally Khudairi, VP of ASF in charge of
> communications (and other things). It is the ASF press release for the
> MADlib TLP graduation.  Big thanks to Sally for marshaling it through.
>
> Looking forward to the upcoming 1.12 release; check for the vote in the next
> day or two from Ed, the release manager.
>
> Thank you for your continued contribution to the project.
>
> Frank
>
> -
>
>  - NASDAQ GlobeNewswire https://globenewswire.com/news-release/2017/08/22/1090924/0/en/The-Apache-Software-Foundation-Announces-Apache-MADlib-as-a-Top-Level-Project.html
>  - ASF "Foundation" blog https://s.apache.org/BSrW
>  - @TheASF Twitter feed https://twitter.com/TheASF/status/899934272169050112
>  - ASF LinkedIn page linkedin.com/company/the-apache-software-foundation
>
> ...plus sent to announce@ and our dedicated media/analyst list. This will
> appear on the apache.org homepage and archives during the next auto-update,
> which should take place within the hour.
>
> Thanks again for all your help, and congratulations on reaching this
> milestone!
>
> Warmly,
> Sally
>