Re: Apache MADlib v1.12 status
Hi Ed,

We have not been able to reproduce https://issues.apache.org/jira/browse/MADLIB-1091 so it may move out.

I still have some docs updates to do so that will be a coming PR probably Tues or Wed.

Frank

On Mon, Aug 14, 2017 at 3:30 PM, Ed Espino wrote:

> MADlib dev,
>
> We are winding down the number of outstanding issues for the Apache MADlib
> v1.12 release. The one outstanding issue is
> https://issues.apache.org/jira/browse/MADLIB-1091. Once this is resolved,
> I'm hoping to start the release process.
>
> Regards,
> -=e
>
> --
> *Ed Espino*
>
[GitHub] incubator-madlib issue #167: Update RELEASE_NOTES for v1.12 release
Github user fmcquillan99 commented on the issue:

    https://github.com/apache/incubator-madlib/pull/167

    Here are some suggested changes/additions:

    1) Change release date to Fri Aug 18 which might be a better estimate.

    2) MLP
    Change
    New Module: Multilayer Perceptron (MADLIB-413)
    to
    New Module: Multilayer Perceptron (MADLIB-413, MADLIB-1134)

    3) APSP
    Change
    New module: Graph - All Pairs Shortest Path (MADLIB-1099)
    to
    New module: Graph - All Pairs Shortest Path (MADLIB-1072, MADLIB-1099, MADLIB-1106)

    4) WCC
    Change
    New module: Graph - Weakly Connected Components (MADLIB-1071, MADLIB-1083)
    to
    New module: Graph - Weakly Connected Components (MADLIB-1071, MADLIB-1083, MADLIB-1101)

    5) Summary
    Change
    Summary: Allow user to determine the number of columns per run (MADLIB-1117)
    to
    Summary:
    - Allow user to determine the number of columns per run (MADLIB-1117)
    - Improve efficiency of computation time by ~35% (MADLIB-1104)

    6) TLP
    Updates for Apache Top Level Project readiness (MADLIB-1130, MADLIB-1133)
    * what about MADLIB-1132 and MADLIB-1142
    * also add the epic MADLIB-1112

    7) Train-test split
    Add:
    New Module: Sample - Train-test split (MADLIB-1119)

    8) Under the bugs section:
    Change
    - Fix the data scaling bug with normalization
    to
    - Fix the data scaling bug with normalization (MADLIB-1094)

    9) Under the bugs section:
    Change
    Update 'optimizer' GUC only if editable
    to
    Update 'optimizer' GUC only if editable (MADLIB-1109)

    10) Change
    Promote cardinality estimators to top level module from early stage
    to
    Promote cardinality estimators to top level module from early stage (MADLIB-1120)

    11) Under the bugs section:
    Change
    Graph: Quoted output table name does not work for some modules
    to
    Graph: Quoted output table name does not work for some modules (MADLIB-1137)

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
Re: MADLIB-1103 --> v2.0 (thoughts)
On Mon, Aug 14, 2017 at 12:12 PM, Ed Espino wrote:

> https://issues.apache.org/jira/browse/MADLIB-1103 (Remove pyxb GPL
> workaround) is dependent on the release of PyXB 1.2.6 (which is currently
> not scheduled). I'm inclined to move it to v2.0 and we can revisit at a
> later point. Thoughts?

Makes sense to me!

Thanks,
Roman.
[GitHub] incubator-madlib pull request #166: Sample: test_train_split
Github user orhankislal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/166#discussion_r133094253

    --- Diff: src/ports/postgres/modules/sample/test_train_split.py_in ---
    @@ -0,0 +1,311 @@
    +# coding=utf-8
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements. See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership. The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License. You may obtain a copy of the License at
    +#
    +# http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing,
    +# software distributed under the License is distributed on an
    +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +# KIND, either express or implied. See the License for the
    +# specific language governing permissions and limitations
    +# under the License.
    +
    +import plpy
    +from utilities.control import MinWarning
    +from utilities.utilities import _assert
    +from utilities.utilities import extract_keyvalue_params
    +from utilities.utilities import add_postfix
    +from utilities.utilities import unique_string
    +from utilities.utilities import split_quoted_delimited_str
    +from utilities.validate_args import table_exists
    +from utilities.validate_args import columns_exist_in_table
    +from utilities.validate_args import table_is_empty
    +from utilities.validate_args import get_expr_type
    +from utilities.validate_args import get_cols
    +from graph.graph_utils import _check_groups
    +from graph.graph_utils import _grp_from_table
    +
    +m4_changequote(` ')
    +
    +
    +def _get_sql_string(str):
    +    if str:
    +        return "'" + str + "'"
    +    return "NULL"
    +
    +def test_train_split(schema_madlib, source_table, output_table, train_proportion,
    +                     test_proportion, grouping_cols, target_cols, with_replacement,
    +                     separate_output_tables, **kwargs):
    +    """
    +    test train split function
    +    Args:
    +        @param source_table Input table name.
    +        @param output_table Output table name.
    +        @param proportion The ratio of sample size to the number of
    --- End diff --

    proportion should be replaced by train_proportion and test_proportion
[GitHub] incubator-madlib pull request #166: Sample: test_train_split
Github user orhankislal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/166#discussion_r133094303

    --- Diff: src/ports/postgres/modules/sample/test_train_split.py_in ---
    @@ -0,0 +1,311 @@
    +# coding=utf-8
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements. See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership. The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License. You may obtain a copy of the License at
    +#
    +# http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing,
    +# software distributed under the License is distributed on an
    +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +# KIND, either express or implied. See the License for the
    +# specific language governing permissions and limitations
    +# under the License.
    +
    +import plpy
    +from utilities.control import MinWarning
    +from utilities.utilities import _assert
    +from utilities.utilities import extract_keyvalue_params
    +from utilities.utilities import add_postfix
    +from utilities.utilities import unique_string
    +from utilities.utilities import split_quoted_delimited_str
    +from utilities.validate_args import table_exists
    +from utilities.validate_args import columns_exist_in_table
    +from utilities.validate_args import table_is_empty
    +from utilities.validate_args import get_expr_type
    +from utilities.validate_args import get_cols
    +from graph.graph_utils import _check_groups
    +from graph.graph_utils import _grp_from_table
    +
    +m4_changequote(` ')
    +
    +
    +def _get_sql_string(str):
    +    if str:
    +        return "'" + str + "'"
    +    return "NULL"
    +
    +def test_train_split(schema_madlib, source_table, output_table, train_proportion,
    +                     test_proportion, grouping_cols, target_cols, with_replacement,
    +                     separate_output_tables, **kwargs):
    +    """
    +    test train split function
    +    Args:
    +        @param source_table Input table name.
    +        @param output_table Output table name.
    +        @param proportion The ratio of sample size to the number of
    +        records.
    +        @param grouping_cols(Default: NULL) The columns to distinguish
    +        each strata.
    +        @param target_cols (Default: NULL) The columns to include in
    +        the output.
    +        @param with_replacement (Default: FALSE) The sampling method.
    +
    --- End diff --

    Missing parameter: separate_output_tables
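Taken together, the two review comments on this docstring ask for `proportion` to be replaced by `train_proportion` and `test_proportion`, and for `separate_output_tables` to be documented. A minimal sketch of what the revised docstring might look like follows. The wording and defaults of the three added entries are assumptions, not text from the PR, and `undocumented_params` is a hypothetical helper added only to check the docstring against the signature shown in the diff.

```python
# Parameters from the signature in the diff (schema_madlib and **kwargs are
# left out because the original docstring does not document them either).
SIGNATURE_PARAMS = (
    "source_table", "output_table", "train_proportion", "test_proportion",
    "grouping_cols", "target_cols", "with_replacement",
    "separate_output_tables",
)

# Hypothetical revised docstring; descriptions of the added entries are guesses.
REVISED_DOCSTRING = """
test train split function
Args:
    @param source_table            Input table name.
    @param output_table            Output table name.
    @param train_proportion        The ratio of training set size to the
                                   number of records (assumed wording).
    @param test_proportion         The ratio of test set size to the
                                   number of records (assumed wording).
    @param grouping_cols           (Default: NULL) The columns to distinguish
                                   each strata.
    @param target_cols             (Default: NULL) The columns to include in
                                   the output.
    @param with_replacement        (Default: FALSE) The sampling method.
    @param separate_output_tables  Whether to write the train and test sets
                                   to separate tables (assumed wording).
"""


def undocumented_params(params, docstring):
    """Return the names in `params` that have no '@param <name>' line."""
    documented = set()
    for line in docstring.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "@param":
            documented.add(words[1])
    return [name for name in params if name not in documented]
```

Calling `undocumented_params(SIGNATURE_PARAMS, REVISED_DOCSTRING)` returns an empty list when every listed parameter has an entry, which is the condition both comments are asking for.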
[GitHub] incubator-madlib pull request #166: Sample: test_train_split
Github user orhankislal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/166#discussion_r133093991

    --- Diff: src/ports/postgres/modules/sample/test_train_split.sql_in ---
    @@ -0,0 +1,321 @@
    +/* --- *//**
    + *
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + *
    + *
    + * @file test_train_split.sql_in
    + *
    + * @brief SQL functions for test train split.
    + * @date 07/19/2017
    + *
    + * @sa Given a table, test train split returns a proportion of records
    + * for each group (strata).
    + *
    + *//* --- */
    +
    +m4_include(`SQLCommon.m4')
    +
    +
    +/**
    +@addtogroup grp_test_train_split
    +
    +Contents
    +
    +test train split
    +Examples
    +
    +
    +
    +@brief A method for independently sampling subpopulations (strata).
    +
    +test train split is a method for independently sampling
    --- End diff --

    The explanation should be modified for the test-train functionality.
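For readers unfamiliar with the operation the file header describes (returning a proportion of records for each group), here is a minimal in-memory sketch of a per-group train/test split. It is not MADlib's implementation, which runs as SQL inside the database. The function name `stratified_split`, the `group_key` callback, and the `seed` argument are all hypothetical, and the sketch assumes sampling without replacement with every record not chosen for training going to the test set.

```python
import random
from collections import defaultdict


def stratified_split(rows, group_key, train_proportion, seed=None):
    """Split rows into (train, test), sampling each group independently.

    Illustrative only: sampling is without replacement, and everything
    not selected for the training set is placed in the test set.
    """
    rng = random.Random(seed)

    # Bucket the rows by group (stratum).
    groups = defaultdict(list)
    for row in rows:
        groups[group_key(row)].append(row)

    train, test = [], []
    for members in groups.values():
        shuffled = members[:]          # copy so the caller's data is untouched
        rng.shuffle(shuffled)
        cut = int(round(train_proportion * len(shuffled)))
        train.extend(shuffled[:cut])   # first `cut` rows of this group
        test.extend(shuffled[cut:])    # remainder of this group
    return train, test
```

With 8 rows in one group, 4 in another, and `train_proportion=0.75`, each group contributes three quarters of its rows to the training set, so the group ratios of the input are preserved in both outputs.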
JIRA for migrating repos following MADlib's TLP graduation
Hi All,

I have opened an Apache Infrastructure ticket to migrate MADlib's git repos and distribution server, and to handle the other common tasks associated with the move from incubator to TLP. The ticket is: https://issues.apache.org/jira/browse/INFRA-14872

Please have a look at it and let me know if I have missed something, or if anything should be changed. I followed the instructions at http://www.apache.org/dev/infra-contact#requesting-graduation to open the ticket, and based it on the template from Apache Flex's TLP ticket, https://issues.apache.org/jira/browse/INFRA-5688.

I will keep you posted on the status of the ticket. We might still need to change some settings in MADlib's Jenkins build once the git repo move is finished. I believe that is something we can control ourselves without Infra's help (please correct me if I am wrong).

NJ
Apache MADlib v1.12 status
MADlib dev, We are winding down the number of outstanding issues for the Apache MADlib v1.12 release. The one outstanding issue is https://issues.apache.org/jira/browse/MADLIB-1091. Once this is resolved, I'm hoping to start the release process. Regards, -=e -- *Ed Espino*
MADLIB-1103 --> v2.0 (thoughts)
https://issues.apache.org/jira/browse/MADLIB-1103 (Remove pyxb GPL workaround) is dependent on the release of PyXB 1.2.6 (which is currently not scheduled). I'm inclined to move it to v2.0 and we can revisit at a later point. Thoughts? -=e -- *Ed Espino*
Re: Jira post v1.12 version?
Thanks Frank. I have moved them to v2.0. The main reason why I am interested in these issues is IMHO they tie directly to easing the dev user community adoption (lowers bar of entry - newer gcc versions supported).

-=e

On Mon, Aug 14, 2017 at 12:04 PM, Frank McQuillan wrote:

> Ed,
>
> I would suggest v2.0 for the next version, so you can add those 2 JIRAs to
> v2.0
>
> Once we get v1.12 out the door I was going to solicit comments from the
> community on v2.0 features so we can get that backlog going.
>
> Frank
>
> On Mon, Aug 14, 2017 at 11:30 AM, Ed Espino wrote:
>
> > Dev,
> >
> > What are we setting the Jira Fix Version/s for issues to be addressed in
> > the next release (post v1.12)? I noticed a v2.0 version (06/Oct/17)
> > available in Jira.
> >
> > The two issues I'd like to set to the next release are the following:
> >
> > https://issues.apache.org/jira/browse/MADLIB-1025 - MADlib does not
> > compile with gcc 6.2
> > https://issues.apache.org/jira/browse/MADLIB-1145 - Ubuntu 16.04 - Using
> > GCC 5 (default gcc) causes Postgres 9.6 crash
> >
> > Any guidance is greatly appreciated.
> >
> > Regards
> > -=e
> >
> > --
> > *Ed Espino*
> >
>

--
*Ed Espino*
[GitHub] incubator-madlib issue #167: Update RELEASE_NOTES for v1.12 release
Github user asfgit commented on the issue:

    https://github.com/apache/incubator-madlib/pull/167

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/madlib-pr-build/155/
Re: Jira post v1.12 version?
Ed,

I would suggest v2.0 for the next version, so you can add those 2 JIRAs to v2.0

Once we get v1.12 out the door I was going to solicit comments from the community on v2.0 features so we can get that backlog going.

Frank

On Mon, Aug 14, 2017 at 11:30 AM, Ed Espino wrote:

> Dev,
>
> What are we setting the Jira Fix Version/s for issues to be addressed in
> the next release (post v1.12)? I noticed a v2.0 version (06/Oct/17)
> available in Jira.
>
> The two issues I'd like to set to the next release are the following:
>
> https://issues.apache.org/jira/browse/MADLIB-1025 - MADlib does not
> compile with gcc 6.2
> https://issues.apache.org/jira/browse/MADLIB-1145 - Ubuntu 16.04 - Using
> GCC 5 (default gcc) causes Postgres 9.6 crash
>
> Any guidance is greatly appreciated.
>
> Regards
> -=e
>
> --
> *Ed Espino*
>
Re: [VOTE]: MADlib repo(s) migration
+1

On Fri, Aug 11, 2017 at 10:16 AM, Nandish Jayaram wrote:

> Hi All,
>
> A gentle reminder to vote if you'd like. I was thinking of opening the
> Apache Infra ticket for the move sometime today if there are no more
> votes to come.
>
> NJ
>
> On Thu, Aug 10, 2017 at 3:39 AM, ChenLiang Wang wrote:
>
> > +1
> >
> > On 08/10/2017 05:47 AM, Orhan Kislal wrote:
> > > +1
> > >
> > > Orhan Kislal
> > >
> > > On Wed, Aug 9, 2017 at 2:32 PM, Nandish Jayaram wrote:
> > >
> > >> Hi All,
> > >>
> > >> With MADlib's graduation to TLP, it's time to migrate its github
> > >> repos from `*incubator-madlib*` to `*madlib*`. We will have to open
> > >> an Apache Infrastructure ticket to request this move for the following
> > >> repos (along with other stuff like wiki, jenkins etc):
> > >> https://git1-us-west.apache.org/repos/asf?p=incubator-madlib.git
> > >> (Read/Write)
> > >> https://github.com/apache/incubator-madlib (GitHub mirror, read only)
> > >> https://git1-us-west.apache.org/repos/asf?p=incubator-madlib-site.git
> > >> https://github.com/apache/incubator-madlib-site (GitHub mirror)
> > >>
> > >> There are two ways to go about this, and the Infra ticket has to be
> > >> raised accordingly.
> > >> 1) Just maintain the current set-up, but have the repos renamed from
> > >> incubator-madlib to madlib.
> > >> 2) Use Gitbox to enable github repo as a R/W repo and not just read-only.
> > >> Check this email (
> > >> https://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201708.mbox/%3cCA+ULb+vP0ViWH4Nc=4eaXvbT0KOmeFtQzp4eAa3p0fKPP7c8...@mail.gmail.com%3e)
> > >> for further information.
> > >>
> > >> Please vote your preference and we can decide to move accordingly.
> > >>
> > >> NJ
> > >>
> > >
> >
>
[GitHub] incubator-madlib pull request #167: Update RELEASE_NOTES for v1.12 release
GitHub user orhankislal opened a pull request:

    https://github.com/apache/incubator-madlib/pull/167

    Update RELEASE_NOTES for v1.12 release

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/orhankislal/incubator-madlib release/rel_notes_1.12

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/167.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #167

commit 3965e6cbbd0bff312c116a80c87dba0214e6d876
Author: Orhan Kislal
Date: 2017-08-14T18:43:29Z

    Update RELEASE_NOTES for v1.12 release
Jira post v1.12 version?
Dev,

What are we setting the Jira Fix Version/s for issues to be addressed in the next release (post v1.12)? I noticed a v2.0 version (06/Oct/17) available in Jira.

The two issues I'd like to set to the next release are the following:

https://issues.apache.org/jira/browse/MADLIB-1025 - MADlib does not compile with gcc 6.2
https://issues.apache.org/jira/browse/MADLIB-1145 - Ubuntu 16.04 - Using GCC 5 (default gcc) causes Postgres 9.6 crash

Any guidance is greatly appreciated.

Regards
-=e

--
*Ed Espino*
[GitHub] incubator-madlib issue #166: Sample: test_train_split
Github user asfgit commented on the issue:

    https://github.com/apache/incubator-madlib/pull/166

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/madlib-pr-build/154/
[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-madlib/pull/162