Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

2016-11-24 Thread Alex Merritt
So it looks like earlier in the process of fixing this for JDBC I broke it
for HBase. Still not quite sure why, but it appears that inserting events
without eventIds is the cause of the deletion. Regardless, I just moved the
event id stripping to JDBCPEvents (to fix insert errors in JDBC). Also
added a test case which fails before this fix.
Committed and pushed. Tests passed locally, Travis is running right now.
Will close the JIRA when I see it complete.

On Wed, Nov 23, 2016 at 11:42 AM, Alex Merritt <leca...@gmail.com> wrote:

> I first took a quick look at the merge, and it looked like the only
> (minor) divergence is in JDBC. And yet, I assume you are using HBase here.
> As was I, when I was later able to reproduce the issue (using
> SelfCleaningDataSourceTest).
>
> Will aim to track down & attempt a fix today / tomorrow.
>
> Alex
>
> On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <leca...@gmail.com> wrote:
>
>> Sure, I can try to reproduce this / take a look tomorrow.
>>
>> Alex
>>
>> On Nov 21, 2016 12:05 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:
>>
>>> Do you have time to look at this Alex? I may have made a mistake in
>>> merging this feature. At present any use of it erases all data. Since it is
>>> only used from templates we haven’t had one that used it except your
>>> integration test that should be merged with Apache-PIO. Can you at least
>>> run those to see if the problem is reproducible? Or tell me how to run
>>> those? It’s included in one of the example templates, right?
>>>
>>>
>>> On Nov 20, 2016, at 5:30 PM, Pat Ferrel (JIRA) <j...@apache.org> wrote:
>>>
>>>
>>> [ https://issues.apache.org/jira/browse/PIO-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>>
>>> Pat Ferrel updated PIO-45:
>>> --
>>> Description:
>>> as integrated into the UR, in the integration-test, the
>>> SelfCleaningDataset erases all data. This feature works fine in the AML
>>> version of PIO.
>>>
>>> Although not tested one could assume that this would be true with any
>>> other Datasource in other templates.
>>>
>>> [~emergentorder] can you check to see if the PIO merge was done
>>> correctly.
>>>
>>>  was:
>>> as integrated into the UR, in the integration-test, the
>>> SelfCleaningDataset erases all data. This feature works fine in the AML
>>> version of PIO.
>>>
>>> Although not tested one could assume that this would be true with any
>>> other Datasource in other templates.
>>>
>>> [~amerritt] can you check to see if the PIO merge was done correctly.
>>>
>>>
>>> > SelfCleaningDatasource erases all data
>>> > --
>>> >
>>> >              Key: PIO-45
>>> >              URL: https://issues.apache.org/jira/browse/PIO-45
>>> >          Project: PredictionIO
>>> >       Issue Type: Bug
>>> > Affects Versions: 0.10.0-incubating
>>> >         Reporter: Pat Ferrel
>>> >         Assignee: Alexander Merritt
>>> >         Priority: Critical
>>> >          Fix For: 0.11.0
>>> >
>>> >
>>> > as integrated into the UR, in the integration-test, the
>>> > SelfCleaningDataset erases all data. This feature works fine in the AML
>>> > version of PIO.
>>> > Although not tested one could assume that this would be true with any
>>> > other Datasource in other templates.
>>> > [~emergentorder] can you check to see if the PIO merge was done
>>> > correctly.
>>>
>>>
>>>
>>> --
>>> This message was sent by Atlassian JIRA
>>> (v6.3.4#6332)
>>>
>>>
>


Re: [jira] [Updated] (PIO-45) SelfCleaningDatasource erases all data

2016-11-23 Thread Alex Merritt
I first took a quick look at the merge, and it looked like the only (minor)
divergence is in JDBC. And yet, I assume you are using HBase here.
As was I, when I was later able to reproduce the issue (using
SelfCleaningDataSourceTest).

Will aim to track down & attempt a fix today / tomorrow.

Alex

On Mon, Nov 21, 2016 at 5:16 PM, Alex Merritt <leca...@gmail.com> wrote:

> Sure, I can try to reproduce this / take a look tomorrow.
>
> Alex
>
> On Nov 21, 2016 12:05 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:
>
>> Do you have time to look at this Alex? I may have made a mistake in
>> merging this feature. At present any use of it erases all data. Since it is
>> only used from templates we haven’t had one that used it except your
>> integration test that should be merged with Apache-PIO. Can you at least
>> run those to see if the problem is reproducible? Or tell me how to run
>> those? It’s included in one of the example templates, right?
>>
>>
>> On Nov 20, 2016, at 5:30 PM, Pat Ferrel (JIRA) <j...@apache.org> wrote:
>>
>>
>> [ https://issues.apache.org/jira/browse/PIO-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Pat Ferrel updated PIO-45:
>> --
>> Description:
>> as integrated into the UR, in the integration-test, the
>> SelfCleaningDataset erases all data. This feature works fine in the AML
>> version of PIO.
>>
>> Although not tested one could assume that this would be true with any
>> other Datasource in other templates.
>>
>> [~emergentorder] can you check to see if the PIO merge was done correctly.
>>
>>  was:
>> as integrated into the UR, in the integration-test, the
>> SelfCleaningDataset erases all data. This feature works fine in the AML
>> version of PIO.
>>
>> Although not tested one could assume that this would be true with any
>> other Datasource in other templates.
>>
>> [~amerritt] can you check to see if the PIO merge was done correctly.
>>
>>
>> > SelfCleaningDatasource erases all data
>> > --
>> >
>> >              Key: PIO-45
>> >              URL: https://issues.apache.org/jira/browse/PIO-45
>> >          Project: PredictionIO
>> >       Issue Type: Bug
>> > Affects Versions: 0.10.0-incubating
>> >         Reporter: Pat Ferrel
>> >         Assignee: Alexander Merritt
>> >         Priority: Critical
>> >          Fix For: 0.11.0
>> >
>> >
>> > as integrated into the UR, in the integration-test, the
>> > SelfCleaningDataset erases all data. This feature works fine in the AML
>> > version of PIO.
>> > Although not tested one could assume that this would be true with any
>> > other Datasource in other templates.
>> > [~emergentorder] can you check to see if the PIO merge was done
>> > correctly.


Re: [VOTE] Apache PredictionIO (incubating) 0.10.0 Release (RC5)

2016-10-02 Thread Alex Merritt
+1 binding

On Sat, Oct 1, 2016 at 3:17 PM, Donald Szeto wrote:

> Hi all,
>
> Thanks for voting. Please indicate binding or non-binding when you cast
> your vote. Release votes require 3 binding +1's from PMC to pass.
>
> Regards,
> Donald
>
> On Saturday, October 1, 2016, Kenneth Chan wrote:
>
> > +1
> >
> > On Saturday, October 1, 2016, Pat Ferrel wrote:
> >
> > > +1 binding
> > >
> > > On Oct 1, 2016, at 10:20 AM, Suneel Marthi wrote:
> > >
> > > +1 binding
> > >
> > > On Sat, Oct 1, 2016 at 12:05 PM, Matthew Tovbin wrote:
> > >
> > > > +1
> > > >
> > > > - Matthew
> > > >
> > > > On Oct 1, 2016 00:18, "Donald Szeto" wrote:
> > > >
> > > > >> This is the vote for 0.10.0 of Apache PredictionIO (incubating).
> > > > >>
> > > > >> The vote will run for at least 72 hours and will close on Oct 3rd, 2016.
> > > > >>
> > > > >> RC2 adds the "apache-" prefix to artifact filenames.
> > > > >>
> > > > >> RC3 adds on top of RC2 with proper licenses and notices embedded in the
> > > > >> Maven artifacts. It also changes the license of the documentation from
> > > > >> Creative Commons to APLv2.
> > > > >>
> > > > >> RC4 fixes a build error of RC3.
> > > > >>
> > > > >> RC5 fixes issues raised by the IPMC:
> > > > >> - Removed 3rd party dependencies from documentation sources
> > > > >> - Fixed incorrect licensing of semver.sh from ASF to BSD
> > > > >> - Moved MySQL connector to optional scope
> > > > >>
> > > > >> The release candidate artifacts can be downloaded here:
> > > > >> https://dist.apache.org/repos/dist/dev/incubator/predictionio/0.10.0-incubating-rc5/
> > > > >>
> > > > >> Test results of RC5 can be found here:
> > > > >> https://travis-ci.org/apache/incubator-predictionio/builds/164221633
> > > > >>
> > > > >> Maven artifacts are built from the release candidate artifacts above, and
> > > > >> are provided as convenience for testing with engine templates. The Maven
> > > > >> artifacts are provided at the Maven staging repo here:
> > > > >> https://repository.apache.org/content/repositories/orgapachepredictionio-1009/
> > > > >>
> > > > >> All JIRAs completed for this release are tagged with 'FixVersion = 0.10.0'.
> > > > >> You can view them here:
> > > > >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320420=12337844
> > > > >>
> > > > >> The artifacts have been signed with Key : 8BF4ABEB
> > > > >>
> > > > >> Please vote accordingly:
> > > > >>
> > > > >> [ ] +1, accept RC as the official 0.10.0 release
> > > > >> [ ] -1, do not accept RC as the official 0.10.0 release because...
> > > >>
> > > >
> > >
> > >
> >
>
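
The rule Donald states in this thread (a release vote needs 3 binding +1's
from the PMC) can be checked mechanically. What follows is only a rough
Python sketch of that count, not part of any Apache or PredictionIO tooling:
the Vote record and both helper functions are hypothetical names, the ballots
are the ones visible in this thread, and a +1 with no "binding" marker is
assumed to be non-binding.

```python
from dataclasses import dataclass

REQUIRED_BINDING = 3  # threshold quoted by Donald in this thread


@dataclass(frozen=True)
class Vote:
    """Hypothetical record for one ballot cast on the list."""
    voter: str
    value: int      # +1 or -1
    binding: bool   # True only when the voter marked the vote as binding


def binding_plus_ones(votes):
    """Count the +1 votes that were explicitly marked binding."""
    return sum(1 for v in votes if v.value == 1 and v.binding)


def meets_binding_threshold(votes, required=REQUIRED_BINDING):
    """True when enough binding +1 votes have been cast."""
    return binding_plus_ones(votes) >= required


# Ballots as they appear in this thread (assumption: unmarked means non-binding).
rc5_votes = [
    Vote("Matthew Tovbin", 1, False),
    Vote("Suneel Marthi", 1, True),
    Vote("Pat Ferrel", 1, True),
    Vote("Kenneth Chan", 1, False),
    Vote("Alex Merritt", 1, True),
]

if __name__ == "__main__":
    print(binding_plus_ones(rc5_votes), meets_binding_threshold(rc5_votes))
```

The sketch only covers the threshold mentioned in the thread; it does not
model -1 votes, the voting window, or who actually sits on the PMC.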


Re: [VOTE]: Apache PredictionIO (incubating) 0.10.0 Release

2016-09-15 Thread Alex Merritt
+1 (binding)

On Sep 15, 2016 1:49 PM, "Suneel Marthi" wrote:

> Folks, when you vote please specify "+1 Binding" if you are a PMC member.
> It's only the PMC votes that count for a release to pass.
>
>
>
> On Thu, Sep 15, 2016 at 2:11 PM, Robert Lu wrote:
>
> > +1
> >
> > > On Sep 15, 2016, at 01:13, Matthew Tovbin wrote:
> > >
> > > +1
> > >
> > > On Wed, Sep 14, 2016 at 10:12 AM, Pat Ferrel wrote:
> > >
> > >> +1
> > >>
> > >>
> > >> On Sep 13, 2016, at 11:55 AM, Donald Szeto wrote:
> > >>
> > >> This is the vote for 0.10.0 of Apache PredictionIO (incubating).
> > >>
> > >> The vote will run for at least 72 hours and will close on Sept 16th, 2016.
> > >>
> > >> The artifacts can be downloaded here:
> > >> https://dist.apache.org/repos/dist/dev/incubator/predictionio/0.10.0-incubating-rc1/
> > >> or
> > >> from the Maven staging repo here:
> > >> https://repository.apache.org/content/repositories/orgapachepredictionio-1001/
> > >>
> > >> All JIRAs completed for this release are tagged with 'FixVersion = 0.10.0'.
> > >> You can view them here:
> > >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320420=12337844
> > >>
> > >> The artifacts have been signed with Key : 8BF4ABEB
> > >>
> > >> Please vote accordingly:
> > >>
> > >> [ ] +1, accept RC as the official 0.10.0 release
> > >> [ ] -1, do not accept RC as the official 0.10.0 release because...
> > >>
> > >>
> >
> >
>


Re: Release Manager

2016-09-08 Thread Alex Merritt
Yes, I've tested all 4 options on Debian and Ubuntu, though someone else
sanity checking that couldn't hurt. Suneel tested on OS X and found only 1
of the 4 worked (any Mac folks out there, please contribute fixes for the
others, though maybe not for this release).
Alex

On Sep 8, 2016 12:15 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:

> Thanks Donald!
>
> We still need to clean up 2 things:
> 1) the LICENSE.txt and NOTICE.txt now have to reflect only sources
> included. I’ll update the license Jira and will do the editing (easy now).
> 2) the install.sh *needs testing*, please anyone who can try it, on
> different distros (Mac, Red Hat-based, Debian-based) can you report back?
> Try the various options.
>
> Alex, is the install.sh in develop set up correctly?
>
>
> On Sep 8, 2016, at 11:04 AM, Alex Merritt <leca...@gmail.com> wrote:
>
> Thanks Donald!
>
> On Sep 8, 2016 9:56 AM, "Donald Szeto" <don...@apache.org> wrote:
>
> > Hi all,
> >
> > If there's no objection, I'll volunteer myself to be the release manager
> > for the first Apache release. I'll make a few more documentation updates,
> > and start shepherding the Apache release process.
> >
> > Regards,
> > Donald
> >
>
>


Re: Release Manager

2016-09-08 Thread Alex Merritt
Thanks Donald!

On Sep 8, 2016 9:56 AM, "Donald Szeto" wrote:

> Hi all,
>
> If there's no objection, I'll volunteer myself to be the release manager
> for the first Apache release. I'll make a few more documentation updates,
> and start shepherding the Apache release process.
>
> Regards,
> Donald
>


Re: Binary or Source release

2016-09-05 Thread Alex Merritt
Agree we should go source only for this release.

On Sep 5, 2016 1:10 PM, "Suneel Marthi" wrote:

> On Mon, Sep 5, 2016 at 2:55 PM, Andrew Purtell wrote:
>
> > I also don't have experience with SBT, apologies. I did do some poking
> > around on Google and it looks like SBT is well behind Maven in providing
> > this type of functionality out of the box or by third party plugin
> > (sbt-assembly does some useful and interesting things but is focused
> > exclusively on producing über jars). I think that's to be expected given
> > the origin story. "Maven is huge and crufty and we want new and simple!"
> > "Ok, let's make Simple Build Tool!" Fast forward. No longer simple. Not
> > able to do a lot of what Maven can. Years of reinventing the wheel ahead,
> > ahoy! Happy to be corrected.
> >
>
> Heh, not to mention that Sbt is just not as flexible as maven in being able
> to handle different phases and cycles of build and deployment.
>
>
> > Doing what I've described looks achievable by programming what is needed
> > in SBT's DSL. Source only releases for a while maybe? Or work up LICENSE
> > and NOTICE files by hand and figure out how to break release builds if
> > dependencies change and the metadata hasn't been updated by hand?
> >
> > I was wondering why Spark went with Maven for their build of reference.
> >
> > My little rant on SBT aside I am NOT suggesting you replace SBT with
> > Maven. That would be in my opinion an unfortunate use of developer
> > bandwidth better put to task getting the current software with current
> > build system out the door in a first Apache release.
> >
>
> +1 and we all seem to agree for a quick source-only first release.
>
>
> > > On Sep 5, 2016, at 11:23 AM, Suneel Marthi wrote:
> > >
> > > > It's easy to do what Andy is describing using maven's assembly plugin in
> > > > the maven world. I have no experience with sbt so can't speak to how it
> > > > can be done with Sbt and would defer that to the experts.
> > > >
> > > > We hit a similar issue with licenses in source and binary on the first
> > > > Pirk release last week. We finally decided to make a source-only first
> > > > release while we are now working on fixing the binary license packaging
> > > > for the next release.
> > >
> > >
> > >
> > > > On Mon, Sep 5, 2016 at 2:05 PM, Andrew Purtell <andrew.purt...@gmail.com>
> > > > wrote:
> > > >
> > > >> It covers LICENSE and NOTICE file generation for both source and binary
> > > >> releases, and inclusion of the resulting files in source archives,
> > > >> binary jars, and binary archives through integration with the maven
> > > >> build and assembly targets.
> > > >>
> > > >> Including the complete text of any given license in LICENSE is important
> > > >> but only needs to be done once. You retain the copyright notice and
> > > >> mention of the license type per dependency. We are just talking about
> > > >> deduplicating, e.g. 100 full texts of the ASLv2 into one.
> > >>
> > > >>> On Sep 5, 2016, at 10:45 AM, Pat Ferrel wrote:
> > > >>>
> > > >>> Thanks Andy.
> > > >>>
> > > >>> RE “Only need to include one entry with the complete text of a license,
> > > >>> everything else can just name the license.” So the copyright notice in
> > > >>> the license is not important, only the license type? This is often the
> > > >>> only important difference in the license from one dep to another.
> > > >>>
> > > >>> It sounds like your automation covered LICENSE.txt creation? or just
> > > >>> inclusion in the binary?
> > >>>
> > >>>
> > > >>> On Sep 5, 2016, at 9:59 AM, Andrew Purtell <andrew.purt...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>> I won't weigh in on the question at hand but I'd like to make a couple
> > > >>> of clarifications for what it is worth:
> > > >>>
> > > >>>> This yielded 166 deps, so this implies we need to include 166 licenses
> > > >>>> and copyright notices in LICENSE.txt.
> > > >>>
> > > >>> There are some available simplifications:
> > > >>>
> > > >>> - Only need to include one entry with the complete text of a license,
> > > >>> everything else can just name the license.
> > > >>>
> > > >>> - Where there are multiple artifacts coming from a single project, like
> > > >>> Hadoop, only one entry for the project is needed.
> > > >>>
> > > >>>> Donald is looking at automating this but I’m personally dubious
> > > >>>
> > > >>> As I think I've mentioned before here we have successfully automated
> > > >>> this for HBase (based on automation done by yet other Apache projects)
> > > >>> so I hope you'll take my advice and evidence based assertion it can be
> > > >>> done. Caveat: we use maven not SBT as build framework.
> > >>>
> > >>>
> > > >>>>
> > > >>>> On Sep 5, 2016, at 9:43 AM, Pat Ferrel wrote:
> > > >>>>
> > > >>>> This weekend I tracked down all our deps, which required a few scripts
> > > >>>> to process sbt output. This yielded 166 deps, so this implies we need
> > > >>>> to include 166
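
The two simplifications Andy lists in the quoted thread (include the full
text of each license only once, and list each upstream project only once
even if it ships several artifacts) amount to a grouping step. Below is a
minimal Python sketch of that grouping, assuming the dependency list has
already been extracted from sbt output into (artifact, project, license,
copyright notice) tuples. The helper names, the tuple layout, and the sample
rows are all made up for illustration; this is not the HBase automation and
not PredictionIO tooling.

```python
from collections import OrderedDict


def build_license_sections(deps):
    """Group dependencies so each license text is needed only once.

    deps: iterable of (artifact, project, license_name, copyright_notice).
    Returns {license_name: {project: copyright_notice}}, keeping a single
    entry per upstream project even when it ships several artifacts.
    """
    sections = OrderedDict()
    for _artifact, project, license_name, notice in deps:
        projects = sections.setdefault(license_name, OrderedDict())
        # The first artifact of a project registers it; later ones add nothing.
        projects.setdefault(project, notice)
    return sections


def render(sections):
    """Render one heading per license and one line per project."""
    lines = []
    for license_name, projects in sections.items():
        lines.append(f"== {license_name} (full text included once) ==")
        for project, notice in projects.items():
            lines.append(f"  {project}: {notice}")
    return "\n".join(lines)


# Hypothetical sample rows, not the real 166-entry dependency list.
sample_deps = [
    ("hadoop-common", "Apache Hadoop", "Apache License 2.0", "<Hadoop notice>"),
    ("hadoop-hdfs", "Apache Hadoop", "Apache License 2.0", "<Hadoop notice>"),
    ("example-lib", "Example Project", "BSD 3-Clause", "<Example notice>"),
]

if __name__ == "__main__":
    print(render(build_license_sections(sample_deps)))
```

The per-project copyright notice is kept, in line with Andy's point that the
notice and license type are retained per dependency while the full license
text is deduplicated.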

Re: PIO-20 problems

2016-08-13 Thread Alex Merritt
Resolved, passing in Travis now.

On Fri, Aug 12, 2016 at 3:04 PM, Pat Ferrel wrote:

> Alex, can you look at these unit test failures on the PR, they seem to be
> in JDBCPEvents
>
> https://travis-ci.org/apache/incubator-predictionio/builds/151905196
>
>
> On Aug 12, 2016, at 1:18 PM, Pat Ferrel wrote:
>
> Can't install unittest with pip or pip3 even though the rest of the
> prerequisites work. It simply refuses to find it. pip3 search unittest gives
> this:
>
> WebTestRunner (0.2)             - Web-based interface for selectively
>                                   executing client-side Python UnitTests
> unittest (0.0)                  -
> nosetests-json-extended (0.1.0) - Create json logging output for python
>                                   nosetests unittest framework
>
> There is no description, and 0.0 seems odd. pip3 install unittest gives:
>
> Collecting unittest
>   Could not find a version that satisfies the requirement unittest (from versions: )
> No matching distribution found for unittest
>
> Tried forcing the version to 0.0 but again no luck.
>
> Ideas? In the meantime waiting for Travis to do it—sigh
>
>
> On Aug 11, 2016, at 2:31 PM, Pat Ferrel wrote:
>
> With the keystore my template-based integration test passes and I’ve put
> the keystore back in. The diffs on PRs on Github seem completely wonky
> right now. Git diff is trustworthy at least.
>
> Will try the python tests now too.
>
> Thanks guys, working smoothly now.
>
>
> On Aug 11, 2016, at 11:53 AM, Donald Szeto wrote:
>
> Hi all,
>
> I went ahead and pulled Pat's branch, performed a clean build, and repeated
> the quick start guide of the Scala parallel recommendation template. I
> could produce the same problem, and root caused it to a missing
> conf/keystore.jks file.
>
> I think with SSL now optional, we should not be distributing a KeyStore
> file. We can either quick fix it now by putting this file back, or modify
> SSLConfiguration to not look for this file when SSL support is off.
>
> Regards,
> Donald
>
> On Thu, Aug 11, 2016 at 8:03 AM, Chan Lee wrote:
>
> > I want to clarify some points:
> >
> > 2) Deploying templates does not require SSL. If you execute
> > ./make-distribution.sh using the current develop branch (provided you
> > change the namespace in the template from io.prediction to
> > org.apache.predictionio), you can deploy without SSL on localhost. The
> > travis tests also do not use SSL and pass as can be seen in
> > https://travis-ci.org/Ziemin/incubator-predictionio.
> >
> > 3) You can run the tests locally the same way they are run on Travis
> > ('python3 ${PIO_DIR}/tests/pio_tests/tests.py'). It is documented in the
> > README in the same directory, but maybe this was not clear enough. If you
> > have any additional questions or run into issues executing the tests,
> > please let me know.
> >
> > Thanks,
> > Chan
> >
> > On Thu, Aug 11, 2016 at 6:07 AM, Pat Ferrel wrote:
> >
> >> I’m always disinclined to commit code that has not been tested. However
> >> many things in the current state of the develop branch seem broken or at
> >> least too big to address in a single commit.
> >>
> >> 1) Style tests fail in Travis but are not runnable locally, or at least
> >> running them locally has not been documented. I have fixed these.
> >> 2) SSL seems to still be required for deploying an engine, PIO-1 may not
> >> have addressed this. Has anyone tried to build and test a template yet
> >> without SSL? Unless someone can state otherwise I’m inclined to ignore
> >> this failure for now since it is out of scope for this commit.
> >> 3) The test framework has been integrated into Travis but again there is
> >> no documented way to run it locally and some of the Travis errors seem
> >> spurious. I am also not inclined to wait for this to be fixed unless
> >> someone can point out real problems it is discovering and send
> >> instructions for how to run it locally.
> >>
> >> Before the extra Travis tests I was able to build and train a template,
> >> which fails on deploy with an SSL config error. Since the PR does not
> >> strictly touch any templates I am going to have to ignore this serious
> >> issue and address it in Jira.
> >>
> >> Unless someone vetoes this I plan to reluctantly push the PR with these
> >> non-trivial issues remaining.
> >>
> >>
> >>
> >
>
>
>
>
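
On the pip failure quoted in this thread: unittest is part of the Python
standard library, so there is nothing to install from PyPI and a plain import
should work on a stock python3. A minimal check is sketched below; the
SmokeTest class and run_smoke_test helper are placeholders for illustration
and have nothing to do with the contents of tests/pio_tests.

```python
import unittest  # standard library module; no pip install required


class SmokeTest(unittest.TestCase):
    """Placeholder test case, only here to show the module is usable."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_smoke_test():
    """Load and run the placeholder test; return True if it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmokeTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print(run_smoke_test())
```

If the import itself fails, the Python installation is the thing to look at
rather than pip.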