Re: [ML] Deployment of user-defined preprocessors

2019-06-03 Thread dmitrievanthony
It's an amazing idea! I have faced this problem several times and it's really annoying, so I'd be glad to see it fixed. Best regards, Anton Dmitriev. -- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: [ML][DISCUSSION] The future of Vectorizer

2019-03-28 Thread dmitrievanthony
It's a brilliant idea, I agree!

Re: Tests for ML using binary builds

2019-03-05 Thread dmitrievanthony
Hi Alexey, I think it's a great idea. Travis + Docker is a very good and cheap solution, so we could start with it. Regarding the statistics, Travis lets you check the status of the last build using a badge, so that shouldn't be a problem either. Best regards, Anton Dmitriev.

Re: Ignite ML withKeepBinary cache

2019-01-02 Thread dmitrievanthony
Hi, I guess we have plans to support caches with binary objects in ML. Please have a look at the following JIRA for details: https://issues.apache.org/jira/browse/IGNITE-10700. Best regards, Anton Dmitriev.

What is the best approach to extend Thin Client functionality?

2018-12-17 Thread dmitrievanthony
Currently the ML/TensorFlow module needs a way to expose some functionality for use in C++ code. As far as I understand, Ignite can only be used from C++ through the Thin Client, and the list of operations it supports is very limited. What is the best

[GitHub] ignite pull request #5533: IGNITE-10289: Import models from XGBoost

2018-11-29 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5533 IGNITE-10289: Import models from XGBoost You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-10289

[GitHub] ignite pull request #5526: IGNITE-10449: Fix javadoc and typos.

2018-11-28 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5526 IGNITE-10449: Fix javadoc and typos. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-10449

[GitHub] ignite pull request #5514: IGNITE-10429: Wrap Scanner in DirectorySerializer...

2018-11-27 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5514 IGNITE-10429: Wrap Scanner in DirectorySerializerTest into try-with-resources. You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] ignite pull request #5507: IGNITE-10287: Add ML inference model storage.

2018-11-27 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5507 IGNITE-10287: Add ML inference model storage. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-10287

[GitHub] ignite pull request #5461: IGNITE-10370: Fix TensorFlowLocalInferenceExample...

2018-11-21 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5461 IGNITE-10370: Fix TensorFlowLocalInferenceExample fail on Windows You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache

[GitHub] ignite pull request #5415: IGNITE-10234: Inference workflow for ML

2018-11-16 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5415 IGNITE-10234: Inference workflow for ML You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-10234

[GitHub] ignite pull request #5249: IGNITE-10133: Switch to per-node TensorFlow worke...

2018-11-07 Thread dmitrievanthony
GitHub user dmitrievanthony closed the pull request at: https://github.com/apache/ignite/pull/5249

[GitHub] ignite pull request #5313: IGNITE-10149: Make ignite-tf.sh executable by def...

2018-11-07 Thread dmitrievanthony
GitHub user dmitrievanthony closed the pull request at: https://github.com/apache/ignite/pull/5313

[GitHub] ignite pull request #5313: IGNITE-10149: Make ignite-tf.sh executable by def...

2018-11-06 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5313 IGNITE-10149: Make ignite-tf.sh executable by default. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite

[GitHub] ignite pull request #5249: IGNITE-10133: Switch to per-node TensorFlow worke...

2018-11-02 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/5249 IGNITE-10133: Switch to per-node TensorFlow worker strategy. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite

[GitHub] ignite pull request #4994: IGNITE-9889: Add JavaDoc group for TensorFlow.

2018-10-15 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4994 IGNITE-9889: Add JavaDoc group for TensorFlow. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9889

Re: Apache Ignite 2.7 release

2018-10-04 Thread dmitrievanthony
Hi, Yury, Nikolay. This issue reproduces in "TensorFlow on Apache Ignite" use cases. When a user prepares a training script (like the official MNIST model https://github.com/tensorflow/models/tree/master/official/mnist), runs it in distributed standalone client mode (see this documentation

[GitHub] ignite pull request #4912: IGNITE-9788: Import IgniteDataset explicitly in T...

2018-10-04 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4912 IGNITE-9788: Import IgniteDataset explicitly in TensorFlow worker code You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache

[GitHub] ignite pull request #4847: IGNITE-9706: Update ignite-tensorflow to support ...

2018-09-27 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4847 IGNITE-9706: Update ignite-tensorflow to support TensorFlow standalone client mode TF_CONFIG variable, make user script to use TF_CLUSTER variable. You can merge this pull request

[GitHub] ignite pull request #4780: IGNITE-9628 Avoid using sun.reflect.generics.refl...

2018-09-18 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4780 IGNITE-9628 Avoid using sun.reflect.generics.reflectiveObjects package in ML module. You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] ignite pull request #4778: IGNITE-9625 Fix ML javadoc

2018-09-18 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4778 IGNITE-9625 Fix ML javadoc You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9625 Alternatively you can

Re: [ML] [New Feature] Trainers as pipeline parameters that can be varied

2018-09-14 Thread dmitrievanthony
In case HTML doesn't work I'll duplicate the message. Hi all, A machine learning pipeline implemented in https://issues.apache.org/jira/browse/IGNITE-9158 (see discussion here http://apache-ignite-developers.2346864.n4.nabble.com/ML-Machine-Learning-Pipeline-Improvement-tt32772.html) supports

[ML] [New Feature] Trainers as pipeline parameters that can be varied

2018-09-14 Thread dmitrievanthony
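Hi all, A machine learning pipeline implemented in IGNITE-9158 (see discussion here http://apache-ignite-developers.2346864.n4.nabble.com/ML-Machine-Learning-Pipeline-Improvement-tt32772.html) supports hyperparameters variation, but not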

Re: How to reduce Scan Query execution time?

2018-08-30 Thread dmitrievanthony
Yes, of course I started by measuring the total iteration time. After that I found that the throughput is about 200Mb/s, and then I started looking for a bottleneck. Because the "downloading" time is less than the "waiting" time, I conclude that the "waiting" step is the bottleneck and so that this thread has been

Re: How to reduce Scan Query execution time?

2018-08-30 Thread dmitrievanthony
To be precise, it's not only about the first page, it's about getting the next pages as well. Regarding the use case, in my client application I need to iterate over the dataset stored in Apache Ignite as fast as possible. It means I should provide maximal throughput for a simple "read all" operation.

Re: How to reduce Scan Query execution time?

2018-08-30 Thread dmitrievanthony
BTW, measurements for the example I've been talking about above:
Page size 5 Mb, waiting time 119.85 ± 6.72 ms
Page size 10 Mb, waiting time 157.70 ± 15.35 ms
Page size 20 Mb, waiting time 204.50 ± 19.18 ms
Page size 50 Mb, waiting time 264.70 ± 22.30 ms
Page size 100 Mb, waiting time 463.35 ± 17.12 ms
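The figures above can be turned into a rough "page size per second of waiting" ratio. The snippet below is only an illustrative sketch and is not part of the original thread: the helper names are hypothetical, "Mb" is assumed to mean megabytes, and the ratio covers the waiting step only, not the full download.

```python
# Measurements copied from the message above:
# (page size, mean waiting time in ms, standard deviation in ms).
# Assumption: "Mb" in the message is read here as megabytes.
MEASUREMENTS = [
    (5, 119.85, 6.72),
    (10, 157.70, 15.35),
    (20, 204.50, 19.18),
    (50, 264.70, 22.30),
    (100, 463.35, 17.12),
]


def page_size_per_waiting_second(page_size_mb, waiting_ms):
    """Return page size divided by waiting time, in MB per second of waiting."""
    return page_size_mb / (waiting_ms / 1000.0)


def summarize(measurements):
    """Return (page size, MB per second of waiting) pairs for every measurement."""
    return [
        (size, page_size_per_waiting_second(size, mean_ms))
        for size, mean_ms, _std_ms in measurements
    ]


if __name__ == "__main__":
    for size, rate in summarize(MEASUREMENTS):
        print(f"page size {size:>3} MB: {rate:6.1f} MB per second of waiting")
```

With these numbers the waiting time grows more slowly than the page size (a 20x larger page waits roughly 4x longer), so the ratio rises with page size.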

Re: How to reduce Scan Query execution time?

2018-08-30 Thread dmitrievanthony
I have already experimented with different page sizes and found out that the "downloading" time is relatively small compared to this "waiting" time, so I've decided that this "waiting" is the bottleneck, and that's why I'm talking about it and measuring it. In the case of AWS, a 10Gbit network allows us to receive

Re: How to reduce Scan Query execution time?

2018-08-30 Thread dmitrievanthony
Hi, I prepared an example that reproduces what I'm talking about. Please take a look: https://github.com/dmitrievanthony/slow-scan-query-reproducer/blob/master/src/main/java/Client.java. I calculate the time between the moment the request has been sent and the moment the result is ready to be received (not fully received). And I

Re: Binary Client Protocol client hangs in case of OOM on server

2018-08-30 Thread dmitrievanthony
BTW, Taras has created the ticket https://issues.apache.org/jira/browse/IGNITE-9379.

Re: Binary Client Protocol client hangs in case of OOM on server

2018-08-24 Thread dmitrievanthony
I've rebuilt Ignite from master and tried again, but the behaviour is the same. If the page is very big (compared with the heap size), the server node fails with OOM and the client hangs. [14:54:06,273][SEVERE][client-connector-#101][ClientListenerProcessor] Runtime error caught during grid runnable

Re: Binary Client Protocol client hangs in case of OOM on server

2018-08-23 Thread dmitrievanthony
Is it a parameter of the query? I see it in the list of OP_QUERY_SQL parameters, but not in the list of parameters of OP_QUERY_SCAN, which I use (https://apacheignite.readme.io/v2.6/docs/binary-client-protocol-sql-operations).

Binary Client Protocol client hangs in case of OOM on server

2018-08-23 Thread dmitrievanthony
When I send a Scan Query request via Binary Client Protocol with a very big page size, I get OOM on the server node: java.lang.OutOfMemoryError: Java heap space at org.apache.ignite.internal.binary.streams.BinaryMemoryAllocatorChunk.reallocate(BinaryMemoryAllocatorChunk.java:69) at

Re: How to reduce Scan Query execution time?

2018-08-23 Thread dmitrievanthony
I checked and it looks like the result is the same (or even worse: I get 1150ms with page size 1000, but the reason might be other changes, since I did the previous measurements using 2.6).

How to reduce Scan Query execution time?

2018-08-23 Thread dmitrievanthony
Hi, I have a cache with 5000 objects, 400Kb each, and I need to download all these objects using Binary Client Protocol. To do that I use a Scan Query request (and Load Next Page to load pages 2, 3, etc.) without any filter. I measure the time between two moments: when the request has been sent and

[GitHub] ignite pull request #4601: IGNITE-9338 Add connection data into env variables...

2018-08-23 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4601 IGNITE-9338 Add connection data into env variables of TensorFlow worker processes You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] ignite pull request #4557: IGNITE-9278 Fix TensorFlow integration: Can't fin...

2018-08-16 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4557 IGNITE-9278 Fix TensorFlow integration: Can't find free ports in range You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache

[GitHub] ignite pull request #4491: IGNITE-9193 Stop child python processes on parent...

2018-08-07 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4491 IGNITE-9193 Stop child python processes on parent stop. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite

[GitHub] ignite pull request #4412: IGNITE-9055 Update Cache Based Dataset so that it...

2018-07-23 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4412 IGNITE-9055 Update Cache Based Dataset so that it ignores empty parts You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache

[GitHub] ignite pull request #4402: IGNITE-9034 Add Estimator API support to TensorFl...

2018-07-23 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4402 IGNITE-9034 Add Estimator API support to TensorFlow cluster on top of Apache Ignite You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] ignite pull request #4214: IGNITE-8795 Add ability to start and maintain Ten...

2018-06-18 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4214 IGNITE-8795 Add ability to start and maintain TensorFlow cluster on top of Apache Ignite You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] ignite pull request #4143: IGNITE-8668 K-fold cross validation of models

2018-06-06 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4143 IGNITE-8668 K-fold cross validation of models You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-8668

[GitHub] ignite pull request #4124: IGNITE-8667 Splitting of dataset to test and trai...

2018-06-04 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4124 IGNITE-8667 Splitting of dataset to test and training sets You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite

[GitHub] ignite pull request #4101: IGNITE-8666 Add ability of filtering data during ...

2018-05-31 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/4101 IGNITE-8666 Add ability of filtering data during datasets creation …asetBuilder. You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] ignite pull request #3807: IGNITE-8233 KNN and SVM algorithms don't work whe...

2018-04-12 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3807 IGNITE-8233 KNN and SVM algorithms don't work when partition doesn't contain data. You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] ignite pull request #3806: IGNITE-8232 ML package cleanup for 2.5 release.

2018-04-12 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3806 IGNITE-8232 ML package cleanup for 2.5 release. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-8232

[GitHub] ignite pull request #3760: IGNITE-8059 Integrate decision tree with partitio...

2018-04-05 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3760 IGNITE-8059 Integrate decision tree with partition based dataset You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache

[GitHub] ignite pull request #3673: IGNITE-7990 Integrate MLP with partition based da...

2018-03-21 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3673 IGNITE-7990 Integrate MLP with partition based dataset You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite

[GitHub] ignite pull request #3614: IGNITE-7897 Add example for LSQR with data normal...

2018-03-07 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3614 IGNITE-7897 Add example for LSQR with data normalization. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite

[GitHub] ignite pull request #3494: IGNITE-7438 LSQR solver for Linear Regression

2018-02-08 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3494 IGNITE-7438 LSQR solver for Linear Regression You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-7438

[GitHub] ignite pull request #3472: IGNITE-7437 Fix javadoc in partition based datase...

2018-02-05 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3472 IGNITE-7437 Fix javadoc in partition based dataset. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-7437

[GitHub] ignite pull request #3410: IGNITE-7437 Partition based dataset implementatio...

2018-01-21 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3410 IGNITE-7437 Partition based dataset implementation You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-7437

[GitHub] ignite pull request #3316: IGNITE-7332: Add test suite for ML examples

2017-12-28 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3316 IGNITE-7332: Add test suite for ML examples You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite IGNITE-7332

[GitHub] ignite pull request #3308: IGNITE-5217: Add Gradient Descent and QR-based tr...

2017-12-28 Thread dmitrievanthony
GitHub user dmitrievanthony opened a pull request: https://github.com/apache/ignite/pull/3308 IGNITE-5217: Add Gradient Descent and QR-based trainers for Linear Regression You can merge this pull request into a Git repository by running: $ git pull https://github.com

Re: Contributor permission request

2017-12-21 Thread dmitrievanthony
My JIRA username is dmitrievanthony. Best regards, Anton Dmitriev

Contributor permission request

2017-12-21 Thread dmitrievanthony
Hi, my name is Anton Dmitriev. I'd like to contribute to the "ml" module of your project. I'd like to start with task 5217. This task was suggested to me by Yuri Babak. Best regards, Anton Dmitriev