Re: [VOTE] Release Apache Ignite 2.7.5-rc4

2019-06-06 Thread Yuriy Babak
+1 - checked ML examples

Best regards,
Yuriy Babak


Thu, Jun 6, 2019 at 19:16, Павлухин Иван:

> +1
> Launched a couple of nodes using the binary package. Ran the SQL examples
> successfully (though with some trouble along the way).
>
> Thu, Jun 6, 2019 at 17:01, Igor Sapego:
> >
> > +1 - checked C++ build
> >
> > Best Regards,
> > Igor
> >
> >
> > On Thu, Jun 6, 2019 at 4:33 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >
> > > +1 (binding) - checked build from source, persistence example and basic
> > > control.sh commands.
> > >
> > > Wed, Jun 5, 2019 at 17:48, Andrey Gura:
> > >
> > > > +1 (binding)
> > > >
> > > > On Wed, Jun 5, 2019 at 4:39 PM Ilya Kasnacheev wrote:
> > > > >
> > > > > +1
> > > > >
> > > > > Built C++ and .Net successfully, started .Net <-> Java <-> C++ SSL
> > > > > cluster on Java 11 and Java 8 without tuning any flags.
> > > > > --
> > > > > Ilya Kasnacheev
> > > > >
> > > > >
> > > > > Wed, Jun 5, 2019 at 09:44, Nikolay Izhikov:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > On Wed, 05/06/2019 at 01:11 +0100, Denis Magda wrote:
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > Nice to have cluster nodes started with ignite.sh in my JVM 11
> > > > > > > environment without any settings.
> > > > > > >
> > > > > > > -
> > > > > > > Denis
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jun 4, 2019 at 6:34 PM Dmitriy Pavlov <dpav...@apache.org> wrote:
> > > > > > >
> > > > > > > > > Dear Community,
> > > > > > > > >
> > > > > > > > > We have uploaded a release candidate to
> > > > > > > > > https://dist.apache.org/repos/dist/dev/ignite/2.7.5-rc4/
> > > > > > > > >
> > > > > > > > > The following staging repository can be used by dependent projects for testing:
> > > > > > > > > https://repository.apache.org/content/repositories/orgapacheignite-1463/
> > > > > > > > >
> > > > > > > > > This is an intermediate release before 2.8, containing Java 11 support and
> > > > > > > > > fixes for Native Persistence storage.
> > > > > > > > >
> > > > > > > > > The tag name is 2.7.5-rc4:
> > > > > > > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=tag;h=refs/tags/2.7.5-rc4
> > > > > > > > >
> > > > > > > > > 2.7.5 changes:
> > > > > > > > >
> > > > > > > > > * Added Java 11 support
> > > > > > > > > * Fixed infinite looping during SSL handshake, affecting Java 11/Windows
> > > > > > > > > * Fixed storage corruption case after incorrectly rotated page
> > > > > > > > > * Erroneous WAL record after incorrectly rotated page is now processed automatically
> > > > > > > > > * Addressed ignite.sh failure on Mac OS and Linux, affecting Java 11
> > > > > > > > > * Launch scripts and some Ignite initialization steps were fixed for Java 12

Re: [ML] Deployment of user-defined preprocessors

2019-06-05 Thread Yuriy Babak
Alexey,

This is a cool change. Have you created a ticket for it?

If not, I can create one.

Best regards,
Yuriy Babak


Fri, May 31, 2019 at 14:20, Алексей Платонов:

> Hi, Igniters!
> Currently we cannot automatically deploy user-defined preprocessors and
> vectorizers. Client code has to be deployed manually to Ignite server nodes.
>
> I have an idea for how to fix this. If we pass the user's classloader and one
> of the user-defined classes from the fit level to
> ComputeUtils.affinityCallWithRetries(), then we will be able to use the
> GridPeerDeployAware interface to send information about this classloader
> to server nodes.
>
> To support this ability we can define interfaces like these:
>
> public interface DeployableObject {
>     public List getDependencies();
> }
>
> and
>
> public interface DeployingContext {
>     public Class userClass();
>     public ClassLoader clientClassLoader();
> }
>
> DeployableObject will be a marker for our final ignite-ml classes, such as
> trainers or concrete preprocessors, and it can return all dependencies
> that should be deployed to server nodes if needed. If these
> dependencies are DeployableObjects too, then the dependencies will be unfolded
> recursively. Classes that are not defined as DeployableObject will be
> recognized as user-defined (NOTE: all leaf classes in our hierarchy will be
> DeployableObject).
>
> This list of DeployableObjects will be used to define the user class loader,
> and one of these objects will be passed to GridPeerDeployAware.
>
> So, this logic allows user-defined Preprocessors and Vectorizers to be passed
> to training algorithms and pipelines.
>
> What do you think?
>
> Sincerely
> Alexey Platonov
>
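
The recursive unfolding described in the proposal above can be sketched in plain Java. This is only an illustration under stated assumptions, not Ignite code: the element type List<Object> for getDependencies() is an assumption (the quoted interface shows a raw List), and DependencyUnfolder, DemoTrainer, DemoLibraryPreprocessor and UserPreprocessor are hypothetical names introduced for the example. It does not call GridPeerDeployAware or ComputeUtils; it only shows how user-defined objects, and from them a class loader, could be found by walking the dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Marker from the proposal; the List<Object> element type is an assumption. */
interface DeployableObject {
    List<Object> getDependencies();
}

/** Hypothetical helper (not part of Ignite): unfolds dependencies recursively. */
final class DependencyUnfolder {
    private DependencyUnfolder() {}

    /** Returns every reachable object that is NOT a DeployableObject, i.e. user-defined. */
    static List<Object> userDefined(Object root) {
        List<Object> res = new ArrayList<>();
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        collect(root, visited, res);
        return res;
    }

    /** Class loader of the first user-defined object found, or null if there is none. */
    static ClassLoader clientClassLoader(Object root) {
        List<Object> user = userDefined(root);
        return user.isEmpty() ? null : user.get(0).getClass().getClassLoader();
    }

    private static void collect(Object obj, Set<Object> visited, List<Object> res) {
        if (obj == null || !visited.add(obj))
            return; // skip nulls and objects already seen (guards against cycles)

        if (obj instanceof DeployableObject) {
            // Library object: descend into whatever it depends on.
            for (Object dep : ((DeployableObject) obj).getDependencies())
                collect(dep, visited, res);
        } else {
            // Anything that is not a DeployableObject is treated as user-defined.
            res.add(obj);
        }
    }
}

/** Stand-in for a library trainer that wraps a preprocessor. */
final class DemoTrainer implements DeployableObject {
    private final Object preprocessor;

    DemoTrainer(Object preprocessor) { this.preprocessor = preprocessor; }

    @Override public List<Object> getDependencies() {
        return Collections.singletonList(preprocessor);
    }
}

/** Stand-in for a library preprocessor that wraps another preprocessor. */
final class DemoLibraryPreprocessor implements DeployableObject {
    private final Object base;

    DemoLibraryPreprocessor(Object base) { this.base = base; }

    @Override public List<Object> getDependencies() {
        return Collections.singletonList(base);
    }
}

/** Stand-in for a class written by the user; it does not implement DeployableObject. */
final class UserPreprocessor {}
```

For a DemoTrainer wrapping a DemoLibraryPreprocessor that wraps a UserPreprocessor, userDefined() yields only the UserPreprocessor instance, which is the object whose class and class loader a DeployingContext would need to carry.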


Root readme.md update

2019-02-08 Thread Yuriy Babak
Igniters,

I noticed that our root readme.md is outdated, so I suggest removing the
sections about the Hadoop accelerator and IGFS.

Also, I want to add a section about ML.

Any thoughts?

Best regards,
Yuriy Babak


[ML] move examples of model import into separate folder

2019-01-23 Thread Yuriy Babak
Igniters,

Currently our examples of importing models from Spark (parquet and PMML)
are located in the "inference" folder.

I want to suggest moving those examples into a separate folder "import" or
"migration" which should be located on the same level.

Any thoughts about this suggestion?

Best regards,
Yuriy Babak


Re: Ignite ML withKeepBinary cache

2019-01-21 Thread Yuriy Babak
Hi all,

Ticket 10700 [1] is resolved. This ticket added support for training models
over a cache with binary objects (a cache with the keepBinary flag enabled).
For more details, please take a look at the mentioned ticket or the added
example [2].

[1] - https://issues.apache.org/jira/browse/IGNITE-10700
[2] - org.apache.ignite.examples.ml.TrainingWithBinaryObjectExample

Sincerely,
Best regards,
Yuriy Babak


Thu, Jan 10, 2019 at 14:07, Alexey Zinoviev:

> Thanks a lot for the example. Will write later about keepBinary support in
> this thread.
>
> Thu, Jan 10, 2019 at 13:28, otorreno:
>
> > Alexey, thanks for your support.
> >
> > Answer to your questions:
> > 1) At the moment the types are: String, Long and Double. But this could
> > actually change in the future to any other user-defined types/classes (We
> > know we would need to provide data encoders for such types)
> > 2) Yes, all data series have the same schema (same number of columns and
> > same types)
> > 3) all_sites.csv
> > <http://apache-ignite-developers.2346864.n4.nabble.com/file/t659/all_sites.csv>
> > contains an example of the data we are trying to work with. Each row of
> > that file contains the metadata of a given data series, as I
> > described in my first post of this thread.
> >
> > Remember that we have this table stored in a cache which uses the
> > withKeepBinary method. The problem I faced was not being able to use
> > such a cache as input to the ML algorithms (copying it to a cache without
> > the keepBinary property would work, but that is not the solution we want
> > to apply). What I would like to do is add support for caches with
> > keepBinary to Ignite ML.
> >
> > Best,
> > Oscar
> >
> >
> >
> > --
> > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> >
>


Re: [DISCUSSION] Drop Scala 2.10 for Apache Ignite 2.8

2019-01-13 Thread Yuriy Babak
In that case, my vote is for the 3.0 release.

Best regards,
Yuriy Babak


Fri, Jan 11, 2019 at 21:29, Denis Magda:

> 1500 (2.11) vs 150 (2.10) - downloads of ignite-scalar artifacts of defined
> version. My point is that those few who are on 2.10 should be considered -
> we usually don't do breaking changes in minor releases, only major.
>
> Ignite 3.0?
>
> --
> Denis
>
> On Fri, Jan 11, 2019 at 12:59 AM Anton Vinogradov  wrote:
>
> > Denis,
> >
> > Could you please specify which component and what type of downloads you
> > mean? What is the ratio between 2.10 and the current version?
> >
> > P.S. There is a good chance that the 150 downloads are caused by TC or by
> > release checks.
> >
> > On Fri, Jan 11, 2019 at 2:17 AM Denis Magda  wrote:
> >
> > > I see that we're getting around 150 scala_2.10 downloads monthly. Is
> > > there any other component which uses it? I would remove the module in
> > > 3.0 to not break the compatibility.
> > >
> > > --
> > > Denis
> > >
> > > On Thu, Jan 10, 2019 at 8:13 AM Nikolay Izhikov wrote:
> > >
> > > > +1
> > > >
> > > > Thu, Jan 10, 2019 at 19:09, Anton Vinogradov:
> > > >
> > > > > +1
> > > > >
> > > > > On Thu, Jan 10, 2019 at 6:02 PM Alexey Kuznetsov <akuznet...@apache.org> wrote:
> > > > >
> > > > > > +1 to drop support of Scala 2.10 in Ignite 2.8.
> > > > > >
> > > > > >
> > > > > > On Thu, Jan 10, 2019 at 9:59 PM Yuriy Babak wrote:
> > > > > >
> > > > > > > Hi Igniters,
> > > > > > >
> > > > > > > What do you think about dropping Scala 2.10 support?
> > > > > > >
> > > > > > > Currently, we support two versions of Scala, 2.10 and 2.11. I suggest
> > > > > > > dropping 2.10 and using only 2.11. Originally we kept the old version
> > > > > > > of Scala to support old versions of Apache Spark.
> > > > > > >
> > > > > > > But Apache Spark removed support for Scala 2.10 as of its 2.3.0
> > > > > > > release, and the current version is 2.4.0.
> > > > > > >
> > > > > > > Please share your opinion
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Yuriy Babak
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Alexey Kuznetsov
> > > > > >
> > > > >
> > > >
> > >
> >
>


[DISCUSSION] Drop Scala 2.10 for Apache Ignite 2.8

2019-01-10 Thread Yuriy Babak
Hi Igniters,

What do you think about dropping Scala 2.10 support?

Currently, we support two versions of Scala, 2.10 and 2.11. I suggest
dropping 2.10 and using only 2.11. Originally we kept the old version of Scala
to support old versions of Apache Spark.

But Apache Spark removed support for Scala 2.10 as of its 2.3.0 release,
and the current version is 2.4.0.

Please share your opinion

Best regards,
Yuriy Babak


Re: [ML] Metric calculation for classification models

2018-12-13 Thread Yuriy Babak
Dmitriy,

Sure, all changes in the ML module will be described on the readme.io site
with the next release (2.8).

Best regards,
Yuriy Babak


Thu, Dec 13, 2018 at 17:21, Dmitriy Pavlov:

> Folks, I sometimes hear complaints about metrics and their clarity for
> end users.
>
> Would you add a couple of words about each value to the wiki/readme.io?
>
> Thu, Dec 13, 2018 at 17:13, Alexey Zinoviev:
>
> > So, I agree that we should avoid inefficient metrics calculations.
> > I think that in the 2.8 release we should have
> >
> >    1. BinaryClassificationMetric with all metrics from Wikipedia
> >    2. A Metric interface with one or two implementations, like ROC AUC and
> >    accuracy, in the example folder or in the metric package
> >    3. BinaryClassificationMetric and MultiClassClassificationMetrics
> >    should implement a new interface, MetricGroup
> >
> > I will rework the current PR completely according to your recommendation.
> >
> > Thu, Dec 13, 2018 at 16:06, Алексей Платонов:
> >
> > > You can compute just the TP (true positive), FP, TN and FN counters and use
> > > them to evaluate Recall, Precision, Accuracy, etc. If you want to specify
> > > the class for Pr evaluation, then you can compute Pr for the first label as
> > > TP/(TP+FP) and for the second label as TN/(TN+FN), for example. After that
> > > we can unite all one-point metric evaluations.
> > >
> > > In my opinion we can redesign metrics calculation and provide one-point
> > > metrics (like Pr, Re) and integral metrics like ROC AUC, where one-point
> > > metrics can be calculated from TP, FP, etc.
> > >
> > > Maybe you should design a class BinaryClassificationMetric that computes
> > > these counters and provides methods like recall :: () -> double,
> > > precision :: () -> double, etc.
> > >
> > > Thu, Dec 13, 2018 at 13:26, Yuriy Babak:
> > >
> > > > Igniters, Alexey
> > > >
> > > > I want to discuss ticket 10371 [1]. Currently, we calculate 4 numbers
> > > > (true positive, true negative, false positive, false negative) for each
> > > > "point metric" like accuracy, recall, f-score and precision, for each
> > > > label.
> > > >
> > > > So for the full score we need to calculate those 4 numbers 8 times, but we
> > > > could calculate them once and derive all 8 metrics (4 for the first label
> > > > and 4 for the second label) from them.
> > > >
> > > > I suggest introducing a new API: "point metric" for metrics like those
> > > > 4 (accuracy, recall, f-score, and precision) and "integral metric" for
> > > > metrics like ROC AUC [2].
> > > >
> > > > Any thoughts would be appreciated.
> > > >
> > > > [1] - https://issues.apache.org/jira/browse/IGNITE-10371
> > > > [2] - https://issues.apache.org/jira/browse/IGNITE-10145
> > > >
> > >
> >
>
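
The counter-based approach discussed in this thread can be sketched as follows. This is a minimal illustration, not the Ignite ML API: BinaryConfusionCounters and all of its method names are hypothetical, and labels are modelled as booleans where true stands for the first label. A single pass over the data fills the four counters, and every point metric for both labels is then derived from them without another pass.

```java
/** Hypothetical sketch (not the Ignite ML API): four counters, all point metrics derived from them. */
final class BinaryConfusionCounters {
    private long tp; // truth = first label,  predicted = first label
    private long fp; // truth = second label, predicted = first label
    private long tn; // truth = second label, predicted = second label
    private long fn; // truth = first label,  predicted = second label

    /** Builds the counters in a single pass over parallel arrays of labels. */
    static BinaryConfusionCounters of(boolean[] truth, boolean[] predicted) {
        if (truth.length != predicted.length)
            throw new IllegalArgumentException("Arrays must have the same length");

        BinaryConfusionCounters c = new BinaryConfusionCounters();
        for (int i = 0; i < truth.length; i++)
            c.update(truth[i], predicted[i]);
        return c;
    }

    /** Registers one observation; true stands for the first label. */
    void update(boolean truth, boolean predicted) {
        if (truth && predicted) tp++;
        else if (!truth && predicted) fp++;
        else if (!truth) tn++;
        else fn++;
    }

    double accuracy() { return ratio(tp + tn, tp + fp + tn + fn); }

    /** Precision for the first label: TP / (TP + FP). */
    double precision() { return ratio(tp, tp + fp); }

    /** Recall for the first label: TP / (TP + FN). */
    double recall() { return ratio(tp, tp + fn); }

    /** Precision for the second label: TN / (TN + FN). */
    double precisionForSecondLabel() { return ratio(tn, tn + fn); }

    /** Recall for the second label: TN / (TN + FP). */
    double recallForSecondLabel() { return ratio(tn, tn + fp); }

    /** F-score for the first label: harmonic mean of precision and recall (NaN if undefined). */
    double fScore() {
        double p = precision();
        double r = recall();
        return 2 * p * r / (p + r);
    }

    /** Returns NaN instead of dividing by zero when a metric is undefined. */
    private static double ratio(long num, long den) {
        return den == 0 ? Double.NaN : (double) num / den;
    }
}
```

An integral metric such as ROC AUC does not fit this shape, because it needs scores at many thresholds rather than one set of counters, which is one reason to keep it behind a separate interface.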


[ML] Metric calculation for classification models

2018-12-13 Thread Yuriy Babak
Igniters, Alexey

I want to discuss ticket 10371 [1]. Currently, we calculate 4 numbers
(true positive, true negative, false positive, false negative) for each
"point metric" like accuracy, recall, f-score and precision, for each label.

So for the full score we need to calculate those 4 numbers 8 times, but we
could calculate them once and derive all 8 metrics (4 for the first label and
4 for the second label) from them.

I suggest introducing a new API: "point metric" for metrics like those
4 (accuracy, recall, f-score, and precision) and "integral metric" for
metrics like ROC AUC [2].

Any thoughts would be appreciated.

[1] - https://issues.apache.org/jira/browse/IGNITE-10371
[2] - https://issues.apache.org/jira/browse/IGNITE-10145


Re: [ML][DISCUSSION] ML Package reorganization

2018-12-12 Thread Yuriy Babak
Dmitry,

Yes, those changes will affect users. But we could postpone this change to
the next major release (3.0), and we could implement a simple migration tool
for the ML part.

Regards,
Yuriy

Wed, Dec 12, 2018 at 15:10, Dmitriy Pavlov:

> Hi Yuri,
>
> Would it affect users?
>
> The Ignite code separates API classes (not internal) from internal
> stuff. Internal classes may change from version to version, but the API
> can't be renamed, moved, and so on. What about ML?
>
> Sincerely,
> Dmitriy Pavlov
>
> Tue, Dec 11, 2018 at 16:56, Yuriy Babak:
>
> > Igniters,
> >
> > I want to discuss the package structure of the ML module. I don't like
> > our current package organization. The main problem is that we have lots of
> > different packages under the root package.
> >
> > For example, algorithm-related packages sit on the same level as utility
> > infrastructure packages such as environment (a package with classes
> > related to the training environment).
> >
> > So I created a branch [1] with an example of a new package structure. Please
> > share any feedback about this package structure and about the idea itself.
> >
> > Regards,
> > Yuriy.
> >
> > [1] - https://github.com/YuriBabak/ignite/tree/ignite-10641/modules/ml/src/main/java/org/apache/ignite/ml
> >
>


[ML][DISCUSSION] ML Package reorganization

2018-12-11 Thread Yuriy Babak
Igniters,

I want to discuss the package structure of the ML module. I don't like
our current package organization. The main problem is that we have lots of
different packages under the root package.

For example, algorithm-related packages sit on the same level as utility
infrastructure packages such as environment (a package with classes
related to the training environment).

So I created a branch [1] with an example of a new package structure. Please
share any feedback about this package structure and about the idea itself.

Regards,
Yuriy.

[1] - https://github.com/YuriBabak/ignite/tree/ignite-10641/modules/ml/src/main/java/org/apache/ignite/ml


Re: Apache Ignite 2.7 release

2018-10-04 Thread Yuriy Babak
Igniters,

We have a new ticket related to TensorFlow integration:
https://issues.apache.org/jira/browse/IGNITE-9788

From my point of view this fix is important for the release, and I want to
include it in 2.7.

Any objections?

Mon, Aug 20, 2018 at 21:22, Nikolay Izhikov:

> Hello, Igniters.
>
> I'm the release manager of Apache Ignite 2.7.
>
> It's time to start the release discussion. [1]
>
> The current code freeze date is September 30.
> If you have any objections, please respond to this thread.
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.7


Re: [ML] IGNITE-9282 Naive Bayes task split

2018-09-30 Thread Yuriy Babak
Hi Ravil,

I think this is a good idea. I prefer to have several small single-feature
tickets instead of a big one with several features.

I will start reviewing 9282 this week. Also, I am looking forward to seeing
those new tickets.

Regards,
Yuriy

Sun, Sep 30, 2018 at 3:58, Ravil Galeyev:

> Hi Team,
> I'm working on implementing Naive Bayes classifiers.
>
> Within IGNITE-9282 I implemented a Gaussian Bayes classifier and created a PR:
> https://github.com/apache/ignite/pull/4869
>
> But there are already a lot of changes.
> That's why I'd like to create separate tasks for the
> multinomial and Bernoulli Bayes classifiers and continue the work.
>
> Any objections?
>
> Best regards,
> Ravil
>