RE: ci.ignite.apache: Run all for patch (jira number)

2017-11-30 Thread Cergey
But I want to run the tests against the patch in jira (it is kinda easier to 
make changes in the patch). Is that feasible?

-Original Message-
From: Denis Magda [mailto:dma...@apache.org] 
Sent: Friday, December 1, 2017 4:53 AM
To: dev@ignite.apache.org
Subject: Re: ci.ignite.apache: Run all for patch (jira number)

Cergey,

You need to run the tests against the pull-request. In your case it should be 
this one - pull/2970/merge

—
Denis

> On Nov 30, 2017, at 12:36 PM, Cergey  wrote:
> 
> Probably, I missed something in the patch 
> (https://issues.apache.org/jira/browse/IGNITE-6745 ) as builds do not (and 
> did not) start automatically. What may be wrong?
> 
> -Original Message-
> From: Cergey [mailto:cossa...@mail.ru.INVALID] 
> Sent: Friday, December 1, 2017 12:31 AM
> To: dev@ignite.apache.org
> Subject: ci.ignite.apache: Run all for patch (jira number)
> 
> Hi, igniters,
> 
> When trying to run "Run all for patch" build with parameter "jira number" 
> (existing, e.g. IGNITE-6745 or random), build fails 
> (https://ci.ignite.apache.org/viewType.html?buildTypeId=Ignite20Tests_RunAllTestBuilds)
>   with exception: java.net.ConnectException: Connection timed out. Is this 
> functionality not supported, or did I set the jira number incorrectly?
> 
> 




[jira] [Created] (IGNITE-7084) It is possible to generate incorrect configuration using Basic configuration screen

2017-11-30 Thread Pavel Konstantinov (JIRA)
Pavel Konstantinov created IGNITE-7084:
--

 Summary: It is possible to generate incorrect configuration using 
Basic configuration screen
 Key: IGNITE-7084
 URL: https://issues.apache.org/jira/browse/IGNITE-7084
 Project: Ignite
  Issue Type: Bug
Reporter: Pavel Konstantinov


1. Open the Basic configuration screen
2. Create a new cluster
3. Set StaticIPFinder as the discovery
4. Remove the address (127.0.0.1)
5. Save the project, download it, and try to start a node using the generated config
{code}
class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, marsh=JdkMarshaller [], reconCnt=10, maxAckTimeout=60, forceSrvMode=false, clientReconnectDisabled=false]
    at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:300)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:876)
    at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1823)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:993)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1903)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1646)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1074)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:992)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:878)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:777)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:647)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:616)
    at org.apache.ignite.Ignition.start(Ignition.java:347)
    at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:302)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Non-shared IP finder must have IP addresses specified in TcpDiscoveryIpFinder.getRegisteredAddresses() configuration property (specify list of IP addresses in configuration).
    at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:346)
    at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1868)
    at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
{code}
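
For reference, a minimal sketch of a configuration that avoids this exception: a non-shared IP finder must keep at least one registered address (the Web Console would normally emit the equivalent Spring XML):

{code}
import java.util.Collections;

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class ValidDiscoveryConfig {
    public static void main(String[] args) {
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();

        // Non-shared IP finders must have at least one registered address.
        ipFinder.setAddresses(Collections.singletonList("127.0.0.1:47500..47509"));

        TcpDiscoverySpi spi = new TcpDiscoverySpi();
        spi.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(spi);

        Ignition.start(cfg);
    }
}
{code}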



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


RE: Does the Ignite C# client support distributed queues?

2017-11-30 Thread Raymond Wilson
Looking at it I see it's blocked by 2701 (which has additional
dependencies, all of which say they are blocked by 2701).

I understand there is an intention to bring the C# client up to par with
the Java client. Is there a ticket/schedule yet for this?

Raymond.

-Original Message-
From: vkulichenko [mailto:valentin.kuliche...@gmail.com]
Sent: Friday, December 1, 2017 1:30 PM
To: u...@ignite.apache.org
Subject: RE: Does the Ignite C# client support distributed queues?

Oops, I read wrong! This is not supported. There is a ticket, but it
doesn't seem to be active at the moment:
https://issues.apache.org/jira/browse/IGNITE-1417

-Val



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


[jira] [Created] (IGNITE-7083) Reduce memory usage of CachePartitionFullCountersMap

2017-11-30 Thread Sunny Chan (JIRA)
Sunny Chan created IGNITE-7083:
--

 Summary: Reduce memory usage of CachePartitionFullCountersMap
 Key: IGNITE-7083
 URL: https://issues.apache.org/jira/browse/IGNITE-7083
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Affects Versions: 2.3
 Environment: Any
Reporter: Sunny Chan


The Cache Partition Exchange Manager keeps a copy of the already completed 
exchange. However, we have found that it uses a significant amount of memory. 
Upon further investigation using a heap dump, we found that a large amount of 
memory is used by the CachePartitionFullCountersMap. We have also observed that in 
most cases these maps contain only zeros.

Therefore I propose an optimization: initially, the long arrays that store the 
initial update counter and the update counter in the CPFCM will be null, and 
when a value is requested while these arrays are null, we return 0 for 
the counter. We only allocate the long arrays once there is a non-zero 
update to the map.
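
A minimal sketch of the proposed lazy allocation (field and method names here are invented for illustration, not taken from the actual CachePartitionFullCountersMap code):

{code}
// Hypothetical sketch: the array stays null until the first non-zero
// update, so an all-zero map costs no heap.
public class LazyCountersMap {
    private final int partitions;

    private long[] updCntrs; // allocated lazily on the first non-zero write

    public LazyCountersMap(int partitions) {
        this.partitions = partitions;
    }

    public long updateCounter(int part) {
        // An all-zero map is represented by a null array.
        return updCntrs == null ? 0 : updCntrs[part];
    }

    public void updateCounter(int part, long val) {
        if (val == 0 && updCntrs == null)
            return; // nothing to store; keep the map empty

        if (updCntrs == null)
            updCntrs = new long[partitions];

        updCntrs[part] = val;
    }
}
{code}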

In our tests, the amount of heap used by GridCachePartitionExchangeManager was 
around 70MB (67 copies of these CPFCM); after we applied the optimization it 
dropped to around 9MB.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Documentation for GA Grid: GA library to Apache Ignite

2017-11-30 Thread techbysample
Denis,

Thanks. In review, I noticed that when I edit the page

"https://apacheignite.readme.io/v2.3/docs/genetic-algorithms", my updates do
not appear instantly.

Is this the correct behavior of ReadMe?

I assumed that updates were 'instant', akin to blogging...

Please advise.

Regards,
Turik 



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/


Re: Deprecate IgniteRDD in embedded mode

2017-11-30 Thread Denis Magda
Val,

Sounds reasonable to me. The fewer useless and potentially harmful features or 
“switches” we have in Ignite, the clearer it will be for the user how to use it 
the right way.

+1 for the deprecation and further removal.

—
Denis

> On Nov 30, 2017, at 3:07 PM, Valentin Kulichenko wrote:
> 
> Igniters,
> 
> Currently we claim to support IgniteRDD in two modes: standalone and
> embedded. Standalone means there is a separately running Ignite cluster,
> and Spark starts client node(s) to interact with it. In embedded mode
> everything runs within Spark, including Ignite server nodes, which are
> started embedded into Spark executors.
> 
> The latter case doesn't really work, mainly because the lifecycle of Spark
> executors is not very predictable - Spark can start and stop them while the
> application is running. In case the Ignite cluster is used to store data (which
> is usually the case), this causes unnecessary rebalancing or even
> unexpected data loss.
> 
> I propose to deprecate and eventually discontinue the embedded mode.
> Luckily, standalone mode is the default one, so we can simply print out a
> clear warning if one switches to embedded mode, and also mention this in
> the docs.
> 
> Thoughts? If there are no objections, I will create a ticket and make the
> change.
> 
> -Val



Re: ci.ignite.apache: Run all for patch (jira number)

2017-11-30 Thread Denis Magda
Cergey,

You need to run the tests against the pull-request. In your case it should be 
this one - pull/2970/merge

—
Denis

> On Nov 30, 2017, at 12:36 PM, Cergey  wrote:
> 
> Probably, I missed something in the patch 
> (https://issues.apache.org/jira/browse/IGNITE-6745 ) as builds do not (and 
> did not) start automatically. What may be wrong?
> 
> -Original Message-
> From: Cergey [mailto:cossa...@mail.ru.INVALID] 
> Sent: Friday, December 1, 2017 12:31 AM
> To: dev@ignite.apache.org
> Subject: ci.ignite.apache: Run all for patch (jira number)
> 
> Hi, igniters,
> 
> When trying to run "Run all for patch" build with parameter "jira number" 
> (existing, e.g. IGNITE-6745 or random), build fails 
> (https://ci.ignite.apache.org/viewType.html?buildTypeId=Ignite20Tests_RunAllTestBuilds)
>   with exception: java.net.ConnectException: Connection timed out. Is this 
> functionality not supported, or did I set the jira number incorrectly?
> 
> 



Re: Internal problems requiring graceful node shutdown, reboot, etc.

2017-11-30 Thread Denis Magda
Hi Dmitriy,

I’m totally for the FailureProcessingPolicy addition to IgniteConfiguration. 

Apart from this, may I ask you to create corresponding documentation tickets for 
the 2.4 release with the “documentation” component? Only for the improvements that are 
getting into the next release. Basically, you can aggregate them if it helps. 
Feel free to assign the tickets to me right away.

—
Denis

> On Nov 30, 2017, at 10:31 AM, Дмитрий Сорокин wrote:
> 
> Hi, Igniters!
> 
> We have a set of internal problems which require a graceful node shutdown
> or another configured reaction (see discussion thread
> http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Enhancement-Proposal-7-Internal-problems-detection-td24460.html
> ):
> - IgniteOutOfMemoryException -
> https://issues.apache.org/jira/browse/IGNITE-6892
> - Persistence errors - https://issues.apache.org/jira/browse/IGNITE-6891
> - ExchangeWorker exits with error -
> https://issues.apache.org/jira/browse/IGNITE-6890
> 
> First, I propose to reconsider the 3rd problem as "System worker exits while the node
> is still running (the node stopping process has not been started)", because we
> have at least 5 worker classes whose running is critical for the node's operation.
> 
> These workers are:
> - partition-exchanger (ExchangeWorker)
> - disco-event-worker
> - nio-acceptor
> - grid-nio-worker-tcp-comm-*
> - grid-timeout-worker
> 
> Second, I propose to use FailureProcessingPolicy (already implemented in the
> scope of task IGNITE-6890) for defining the reaction to the 1st and 2nd detected
> problems too. This policy can be configured similarly to SegmentationPolicy
> in IgniteConfiguration.
> 
> Opinions?



Re: Deprecate IgniteRDD in embedded mode

2017-11-30 Thread Holden Karau
So, for what it's worth, more and more of Spark's own services have also
moved to be in separate processes, and with the increased work around
scaling, the executors are going to continue this trend.

On Thu, Nov 30, 2017 at 3:07 PM, Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Igniters,
>
> Currently we claim to support IgniteRDD in two modes: standalone and
> embedded. Standalone means there is a separately running Ignite cluster,
> and Spark starts client node(s) to interact with it. In embedded mode
> everything runs within Spark, including Ignite server nodes, which are
> started embedded into Spark executors.
>
> The latter case doesn't really work, mainly because the lifecycle of Spark
> executors is not very predictable - Spark can start and stop them while the
> application is running. In case the Ignite cluster is used to store data (which
> is usually the case), this causes unnecessary rebalancing or even
> unexpected data loss.
>
> I propose to deprecate and eventually discontinue the embedded mode.
> Luckily, standalone mode is the default one, so we can simply print out a
> clear warning if one switches to embedded mode, and also mention this in
> the docs.
>
> Thoughts? If there are no objections, I will create a ticket and make the
> change.
>
> -Val
>



-- 
Twitter: https://twitter.com/holdenkarau


Deprecate IgniteRDD in embedded mode

2017-11-30 Thread Valentin Kulichenko
Igniters,

Currently we claim to support IgniteRDD in two modes: standalone and
embedded. Standalone means there is a separately running Ignite cluster,
and Spark starts client node(s) to interact with it. In embedded mode
everything runs within Spark, including Ignite server nodes, which are
started embedded into Spark executors.

The latter case doesn't really work, mainly because the lifecycle of Spark
executors is not very predictable - Spark can start and stop them while the
application is running. In case the Ignite cluster is used to store data (which
is usually the case), this causes unnecessary rebalancing or even
unexpected data loss.

I propose to deprecate and eventually discontinue the embedded mode.
Luckily, standalone mode is the default one, so we can simply print out a
clear warning if one switches to embedded mode, and also mention this in
the docs.

Thoughts? If there are no objections, I will create a ticket and make the
change.

-Val
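
For context, a minimal sketch of standalone-mode usage, assuming the ignite-spark Java wrappers (JavaIgniteContext / JavaIgniteRDD), their (sc, springUrl) constructor, and a Spring config file named ignite-config.xml; standalone is the default mode:

{code}
import org.apache.ignite.spark.JavaIgniteContext;
import org.apache.ignite.spark.JavaIgniteRDD;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class StandaloneIgniteRdd {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("ignite-rdd").setMaster("local[2]"));

        // Standalone mode: Spark starts client nodes that connect to an
        // already running Ignite cluster described by ignite-config.xml.
        JavaIgniteContext<String, Integer> ic =
            new JavaIgniteContext<>(sc, "ignite-config.xml");

        JavaIgniteRDD<String, Integer> rdd = ic.fromCache("myCache");

        System.out.println("Entries in cache: " + rdd.count());

        ic.close(false); // stop the client nodes, keep the cluster running
        sc.stop();
    }
}
{code}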


Re: Replicated with CacheStore and Read-Through works like Partitioned with no backups.

2017-11-30 Thread Dmitriy Setrakyan
It sounds like a bug, get() from backup should update the local map.
Andrey, please feel free to file a ticket.

On Thu, Nov 30, 2017 at 1:41 AM, Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Hi Igniters,
>
> Ignite Replicated cache with CacheStore and Read-through=true and
> ReadFromBackup=true works in an unexpected way from the user's perspective.
>
> Imagine a case where a user has a large dataset in the underlying db and wants to
> cache hot data only, so there is no need to do cache.loadCache().
>
> The user's expectation is that when he gets an entry from Ignite, Ignite goes to the
> CacheStore and saves the entry locally (as readthrough=true), and a
> subsequent get() of the same entry will return the result from the local map.
>
> For now this is true if the user calls get() on the primary node.
> But it works in a different and unexpected way on a backup, where every get
> operation will go to the primary node regardless of ReadFromBackup=true and
> Read-through=true.
>
> So, a Replicated cache in that case works like Partitioned with no backups.
>
> Of course, I understand it would be even weirder behavior if a get() operation
> caused a cluster-wide operation to update backups when read-through
> happened.
> But why doesn't get() from a backup update the local cache map once the entry
> is requested from the primary?
>
> Can we fix this or have some additional mode to work around this issue?
> Thoughts?
>
> --
> Best regards,
> Andrey V. Mashenkov
>


Re: TC issues. IGNITE-3084. Spark Data Frame API

2017-11-30 Thread Николай Ижиков

Valentin,

Now it runs OK.

Thank you.

30.11.2017 23:41, Valentin Kulichenko wrote:

> Nikolay,
>
> Please try once again.
>
> -Val
>
> On Thu, Nov 30, 2017 at 11:43 AM, Николай Ижиков wrote:
>
>> Valentin,
>>
>> Thank you, but your changes are not enough.
>>
>> I ran a build for my branch and it still has the "Unsupported major.minor version 52.0" issue [1].
>>
>> Build log:
>>
>> `Starting: /usr/lib/jvm/java-7-oracle/bin/java -DJAVA_HOME=/usr/lib/jvm/java-8-oracle`
>>
>> I looked at the build settings and found some variables that still point to jdk7:
>>
>> Environment variables:
>>
>> env.JAVA_HOME   /usr/lib/jvm/java-7-oracle
>> env.JDK_HOME    /usr/lib/jvm/java-7-oracle
>>
>> Can you please change these variables too?
>>
>> [1] https://ci.ignite.apache.org/viewLog.html?buildId=970913&buildTypeId=Ignite20Tests_IgniteRdd&tab=buildLog
>>
>> 30.11.2017 22:16, Valentin Kulichenko wrote:
>>
>>> Nikolay,
>>>
>>> Java 7 support will be dropped by Ignite soon, so let's do the upgrade now. I changed both the 'Ignite RDD' and 'Ignite RDD spark 2_10' configurations on TC to use JDK 8. Can you try it out and let me know if it works?
>>>
>>> -Val
>>>
>>> On Wed, Nov 29, 2017 at 11:28 PM, Николай Ижиков wrote:
>>>
>>>> Valentin,
>>>>
>>>> > Oh, so this is because of upgrade to 2.2.0?
>>>>
>>>> Yes, we should upgrade the spark module to jdk1.8 because of switching to spark 2.2.0.
>>>>
>>>> > least we should consider not dropping previous version yet.
>>>>
>>>> Please note, we can have IgniteRDD and Ignite Data Frame for spark 2.1.
>>>> The only thing we can't have is IgniteCatalog.
>>>>
>>>> Currently, in my PR I include IgniteRDD and Ignite Data Frame in the spark_2.10 module, so we don't have to drop spark 2.1 completely.
>>>>
>>>> https://github.com/apache/ignite/pull/2742
>>>>
>>>> 30.11.2017 02:09, Valentin Kulichenko wrote:
>>>>
>>>>> Nikolay,
>>>>>
>>>>> Oh, so this is because of upgrade to 2.2.0? Then I'm not sure we should upgrade in the first place, or at least we should consider not dropping the previous version yet. I sent a message to the original thread about the upgrade, let's decide there and then come back to this issue.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Tue, Nov 28, 2017 at 8:08 PM, Николай Ижиков wrote:
>>>>>
>>>>>> Hello, Valentin.
>>>>>>
>>>>>> > Added '-Dscala-2.10' to the build config. Let me know if it helps.
>>>>>>
>>>>>> Yes, it helps. Thank you!
>>>>>> Now, 'Ignite RDD spark 2_10' succeeds for my branch.
>>>>>>
>>>>>> > Do you mean that IgniteRDD does not 

Re: TC issues. IGNITE-3084. Spark Data Frame API

2017-11-30 Thread Valentin Kulichenko
Nikolay,

Please try once again.

-Val

On Thu, Nov 30, 2017 at 11:43 AM, Николай Ижиков wrote:

> Valentin,
>
> Thank you, but your changes are not enough.
>
> I ran a build for my branch and it still has the "Unsupported major.minor version 52.0" issue [1].
>
> Build log:
>
> `Starting: /usr/lib/jvm/java-7-oracle/bin/java -DJAVA_HOME=/usr/lib/jvm/java-8-oracle`
>
> I looked at the build settings and found some variables that still point to jdk7:
>
> Environment variables:
>
> env.JAVA_HOME   /usr/lib/jvm/java-7-oracle
> env.JDK_HOME    /usr/lib/jvm/java-7-oracle
>
> Can you please change these variables too?
>
> [1] https://ci.ignite.apache.org/viewLog.html?buildId=970913&buildTypeId=Ignite20Tests_IgniteRdd&tab=buildLog
>
> 30.11.2017 22:16, Valentin Kulichenko wrote:
>
>> Nikolay,
>>
>> Java 7 support will be dropped by Ignite soon, so let's do the upgrade now. I changed both the 'Ignite RDD' and 'Ignite RDD spark 2_10' configurations on TC to use JDK 8. Can you try it out and let me know if it works?
>>
>> -Val
>>
>> On Wed, Nov 29, 2017 at 11:28 PM, Николай Ижиков wrote:
>>
>>> Valentin,
>>>
>>> > Oh, so this is because of upgrade to 2.2.0?
>>>
>>> Yes, we should upgrade the spark module to jdk1.8 because of switching to spark 2.2.0.
>>>
>>> > least we should consider not dropping previous version yet.
>>>
>>> Please note, we can have IgniteRDD and Ignite Data Frame for spark 2.1.
>>> The only thing we can't have is IgniteCatalog.
>>>
>>> Currently, in my PR I include IgniteRDD and Ignite Data Frame in the spark_2.10 module, so we don't have to drop spark 2.1 completely.
>>>
>>> https://github.com/apache/ignite/pull/2742
>>>
>>> 30.11.2017 02:09, Valentin Kulichenko wrote:
>>>
>>>> Nikolay,
>>>>
>>>> Oh, so this is because of upgrade to 2.2.0? Then I'm not sure we should upgrade in the first place, or at least we should consider not dropping the previous version yet. I sent a message to the original thread about the upgrade, let's decide there and then come back to this issue.
>>>>
>>>> -Val
>>>>
>>>> On Tue, Nov 28, 2017 at 8:08 PM, Николай Ижиков wrote:
>>>>
>>>>> Valentin,
>>>>>
>>>>> For now the `Ignite RDD` build runs on jdk1.7.
>>>>> We need to update it to jdk1.8.
>>>>>
>>>>> I wrote out the whole version numbers to be clear:
>>>>>
>>>>> 1. Current master - Spark version is 2.1.0.
>>>>>    So both `Ignite RDD` and `Ignite RDD 2.10` run OK on jdk1.7.
>>>>>
>>>>> 2. My branch -
>>>>>    `Ignite RDD 2.10` - spark version is 2.1.2 - runs OK on jdk1.7.
>>>>>    `Ignite RDD` - spark version 2.2.0 - fails on jdk1.7, *has to be changed to run on jdk1.8*
>>>>>
>>>>> 29.11.2017 03:27, Valentin Kulichenko wrote:
>>>>>
>>>>>> Nikolay,
>>>>>>
>>>>>> If Spark requires Java 8, then I guess we have no choice. How is TC configured at the moment? My understanding is that the Spark related suites are successfully executed there, so is there an issue?
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Tue, Nov 28, 2017 at 2:42 AM, Николай Ижиков wrote:
>>>>>>
>>>>>>> Hello, Valentin.
>>>>>>>
>>>>>>> > Added '-Dscala-2.10' to the build config. Let me know if it helps.
>>>>>>>
>>>>>>> Yes, it helps. Thank you!
>>>>>>> Now, 'Ignite RDD spark 2_10' succeeds for my branch.
>>>>>>>
>>>>>>> > Do you mean that IgniteRDD does not compile on JDK7? If yes, do we know the reason? I don't think switching it to JDK8 is a solution as it should work with both.
>>>>>>>
>>>>>>> I mean that the latest version of spark doesn't support jdk7.
>>>>>>>
>>>>>>> http://spark.apache.org/docs/latest/
>>>>>>>
>>>>>>> "Spark runs on Java 8+..."
>>>>>>> "For the Scala API, Spark 2.2.0 uses Scala 2.11..."
>>>>>>> "Note that support for Java 7... were removed as of Spark 2.2.0"
>>>>>>> "Note that support for Scala 2.10 is deprecated..."
>>>>>>>
>>>>>>> Moreover, we can't have 

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-30 Thread Valentin Kulichenko
Great! Let me know if you need any assistance and/or intermediate review.

-Val

On Thu, Nov 30, 2017 at 12:05 AM, Николай Ижиков wrote:

> Valentin,
>
> > Can you please create a separate ticket for the strategy implementation then?
>
> Done.
>
> https://issues.apache.org/jira/browse/IGNITE-7077
>
> > Any idea on how long will it take?
>
> I think it will take 2-4 weeks to implement such a strategy.
> I will try my best to make a ready-to-review PR before the end of the year.
>
> 30.11.2017 02:13, Valentin Kulichenko wrote:
>
>> Nikolay,
>>
>> Can you please create a separate ticket for the strategy implementation then? Any idea on how long will it take?
>>
>> As for querying a partition, both SqlQuery and SqlFieldsQuery allow to specify the set of partitions to work with (see the setPartitions method). I think that should be enough.
>>
>> -Val
>>
>> On Wed, Nov 29, 2017 at 3:39 AM, Vladimir Ozerov wrote:
>>
>>> Hi Nikolay,
>>>
>>> No, it is not possible to get this info from public API, neither do we plan to expose it. See IGNITE-4509 and commit *fbf0e353* to get a better understanding of how this was implemented.
>>>
>>> Vladimir.
>>>
>>> On Wed, Nov 29, 2017 at 2:01 PM, Николай Ижиков wrote:
>>>
>>>> Hello, Vladimir.
>>>>
>>>> > partition pruning is already implemented in Ignite, so there is no need to do this on your own.
>>>>
>>>> Spark works with a partitioned data set.
>>>> It is required to provide data partition information to Spark from a custom Data Source (Ignite).
>>>>
>>>> Can I get information about pruned partitions through some public API?
>>>> Is there a plan or ticket to implement such an API?
>>>>
>>>> 2017-11-29 10:34 GMT+03:00 Vladimir Ozerov:
>>>>
>>>>> Nikolay,
>>>>>
>>>>> Regarding p3. - partition pruning is already implemented in Ignite, so there is no need to do this on your own.
>>>>>
>>>>> On Wed, Nov 29, 2017 at 3:23 AM, Valentin Kulichenko wrote:
>>>>>
>>>>>> Nikolay,
>>>>>>
>>>>>> A custom strategy allows to fully process the AST generated by Spark and convert it to Ignite SQL, so there will be no execution on the Spark side at all. This is what we are trying to achieve here. Basically, one will be able to use the DataFrame API to execute queries directly on Ignite. Does it make sense to you?
>>>>>>
>>>>>> I would recommend you to take a look at the MemSQL implementation which does similar stuff: https://github.com/memsql/memsql-spark-connector
>>>>>>
>>>>>> Note that this approach will work only if all relations included in the AST are Ignite tables. Otherwise, the strategy should return null so that Spark falls back to its regular mode. Ignite will be used as a regular data source in this case, and probably it's possible to implement some optimizations here as well. However, I never investigated this and it seems like another separate discussion.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Tue, Nov 28, 2017 at 9:54 AM, Николай Ижиков wrote:
>>>>>>
>>>>>>> Hello, guys.
>>>>>>>
>>>>>>> I have implemented basic support of the Spark Data Frame API [1], [2] for Ignite.
>>>>>>> Spark provides an API for a custom strategy to optimize queries from spark to the underlying data source (Ignite).
>>>>>>>
>>>>>>> The goal of optimization (obvious, just to be on the same page):
>>>>>>> Minimize data transfer between Spark and Ignite.
>>>>>>> Speed up query execution.
>>>>>>>
>>>>>>> I see 3 ways to optimize queries:
>>>>>>>
>>>>>>> 1. *Join Reduce* If one makes some query that joins two or more Ignite tables, we have to pass all the join info to Ignite and transfer to Spark only the result of the table join.
>>>>>>> To implement it we have to extend the current implementation with a new RelationProvider that can generate all kinds of joins for two or more tables.
>>>>>>> We should add some tests, also.
>>>>>>> The question is - how should the join result be partitioned?
>>>>>>>
>>>>>>> 2. *Order by* If one makes some query to an Ignite table with an order by clause, we can execute the sorting on the Ignite side.
>>>>>>> But it seems that currently Spark doesn't have any way to tell that partitions are already sorted.
>>>>>>>
>>>>>>> 3. *Key filter* If one makes a query with `WHERE key = XXX` or `WHERE key IN (X, Y, Z)`, we can reduce the number of partitions.
>>>>>>> And query 
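
For reference, a minimal sketch of the partition-restricted querying mentioned above (setPartitions is the public API named in the thread; the cache and table names are made up):

{code}
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.configuration.CacheConfiguration;

public class PartitionRestrictedQuery {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        CacheConfiguration<Long, String> ccfg = new CacheConfiguration<>("person");
        ccfg.setIndexedTypes(Long.class, String.class); // exposes SQL table "String"

        IgniteCache<Long, String> cache = ignite.getOrCreateCache(ccfg);

        // Restrict execution to the partitions that survived pruning.
        SqlFieldsQuery qry = new SqlFieldsQuery("select _key, _val from String")
            .setPartitions(0, 7, 15);

        for (List<?> row : cache.query(qry).getAll())
            System.out.println(row);
    }
}
{code}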

RE: ci.ignite.apache: Run all for patch (jira number)

2017-11-30 Thread Cergey
Probably, I missed something in the patch 
(https://issues.apache.org/jira/browse/IGNITE-6745 ) as builds do not (and did 
not) start automatically. What may be wrong?

-Original Message-
From: Cergey [mailto:cossa...@mail.ru.INVALID] 
Sent: Friday, December 1, 2017 12:31 AM
To: dev@ignite.apache.org
Subject: ci.ignite.apache: Run all for patch (jira number)

Hi, igniters,

When trying to run "Run all for patch" build with parameter "jira number" 
(existing, e.g. IGNITE-6745 or random), build fails 
(https://ci.ignite.apache.org/viewType.html?buildTypeId=Ignite20Tests_RunAllTestBuilds)
  with exception: java.net.ConnectException: Connection timed out. Is this 
functionality not supported, or did I set the jira number incorrectly?




ci.ignite.apache: Run all for patch (jira number)

2017-11-30 Thread Cergey
Hi, igniters,

When trying to run "Run all for patch" build with parameter "jira number" 
(existing, e.g. IGNITE-6745 or random), build fails 
(https://ci.ignite.apache.org/viewType.html?buildTypeId=Ignite20Tests_RunAllTestBuilds)
  with exception: java.net.ConnectException: Connection timed out. Is this 
functionality not supported, or did I set the jira number incorrectly?



Re: Time to drop Java 7?

2017-11-30 Thread Dmitriy Setrakyan
+1

On Thu, Nov 30, 2017 at 11:29 AM, Denis Magda  wrote:

> Igniters,
>
> Considering that we’re going to support Java 9 in the next release and are
> hitting several limitations caused by current Java 7 support (see the spark
> issue [2], and accounting for the difficulties of Java 8 based ML development),
> I would propose to discontinue Java 7.
>
> Let’s do both things in AI 2.4: Java 9 support and Java 7 removal.
>
> Share your thoughts on this?
>
> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Java-9-support-td23612.html
> [2] http://apache-ignite-developers.2346864.n4.nabble.com/TC-issues-IGNITE-3084-Spark-Data-Frame-API-td24639.html
>
> —
> Denis


Re: TC issues. IGNITE-3084. Spark Data Frame API

2017-11-30 Thread Николай Ижиков

Valentin,

Thank you, but your changes are not enough.

I ran a build for my branch and it still has the "Unsupported major.minor version 52.0" issue [1].

Build log:

`Starting: /usr/lib/jvm/java-7-oracle/bin/java -DJAVA_HOME=/usr/lib/jvm/java-8-oracle`

I looked at the build settings and found some variables that still point to jdk7:

Environment variables:

env.JAVA_HOME   /usr/lib/jvm/java-7-oracle
env.JDK_HOME    /usr/lib/jvm/java-7-oracle

Can you please change these variables too?

[1] https://ci.ignite.apache.org/viewLog.html?buildId=970913&buildTypeId=Ignite20Tests_IgniteRdd&tab=buildLog

30.11.2017 22:16, Valentin Kulichenko wrote:

> Nikolay,
>
> Java 7 support will be dropped by Ignite soon, so let's do the upgrade now. I changed both the 'Ignite RDD' and 'Ignite RDD spark 2_10' configurations on TC to use JDK 8. Can you try it out and let me know if it works?
>
> -Val
>
> On Wed, Nov 29, 2017 at 11:28 PM, Николай Ижиков wrote:
>
>> Valentin,
>>
>> > Oh, so this is because of upgrade to 2.2.0?
>>
>> Yes, we should upgrade the spark module to jdk1.8 because of switching to spark 2.2.0.
>>
>> > least we should consider not dropping previous version yet.
>>
>> Please note, we can have IgniteRDD and Ignite Data Frame for spark 2.1.
>> The only thing we can't have is IgniteCatalog.
>>
>> Currently, in my PR I include IgniteRDD and Ignite Data Frame in the spark_2.10 module, so we don't have to drop spark 2.1 completely.
>>
>> https://github.com/apache/ignite/pull/2742
>>
>> 30.11.2017 02:09, Valentin Kulichenko wrote:
>>
>>> Nikolay,
>>>
>>> Oh, so this is because of upgrade to 2.2.0? Then I'm not sure we should upgrade in the first place, or at least we should consider not dropping the previous version yet. I sent a message to the original thread about the upgrade, let's decide there and then come back to this issue.
>>>
>>> -Val
>>>
>>> On Tue, Nov 28, 2017 at 8:08 PM, Николай Ижиков wrote:
>>>
>>>> Valentin,
>>>>
>>>> For now the `Ignite RDD` build runs on jdk1.7.
>>>> We need to update it to jdk1.8.
>>>>
>>>> I wrote out the whole version numbers to be clear:
>>>>
>>>> 1. Current master - Spark version is 2.1.0.
>>>>    So both `Ignite RDD` and `Ignite RDD 2.10` run OK on jdk1.7.
>>>>
>>>> 2. My branch -
>>>>    `Ignite RDD 2.10` - spark version is 2.1.2 - runs OK on jdk1.7.
>>>>    `Ignite RDD` - spark version 2.2.0 - fails on jdk1.7, *has to be changed to run on jdk1.8*
>>>>
>>>> 29.11.2017 03:27, Valentin Kulichenko wrote:
>>>>
>>>>> Nikolay,
>>>>>
>>>>> If Spark requires Java 8, then I guess we have no choice. How is TC configured at the moment? My understanding is that the Spark related suites are successfully executed there, so is there an issue?
>>>>>
>>>>> -Val
>>>>>
>>>>> On Tue, Nov 28, 2017 at 2:42 AM, Николай Ижиков wrote:
>>>>>
>>>>>> "Spark runs on Java 8+..."
>>>>>> "For the Scala API, Spark 2.2.0 uses Scala 2.11..."
>>>>>> "Note that support for Java 7... were removed as of Spark 2.2.0"
>>>>>> "Note that support for Scala 2.10 is deprecated..."
>>>>>>
>>>>>> Moreover, we can't have IgniteCatalog for spark 2.1.
>>>>>> Please see my explanation in the jira ticket -
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/IGNITE-3084?focusedCommentId=16268523&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16268523

Time to drop Java 7?

2017-11-30 Thread Denis Magda
Igniters,

Considering that we’re going to support Java 9 in the next release and are hitting 
several limitations caused by current Java 7 support (see the spark issue [2], and 
accounting for the difficulties of Java 8 based ML development), I would propose to 
discontinue Java 7.

Let’s do both things in AI 2.4: Java 9 support and Java 7 removal.

Share your thoughts on this?

[1] http://apache-ignite-developers.2346864.n4.nabble.com/Java-9-support-td23612.html
[2] http://apache-ignite-developers.2346864.n4.nabble.com/TC-issues-IGNITE-3084-Spark-Data-Frame-API-td24639.html


—
Denis

Re: TC issues. IGNITE-3084. Spark Data Frame API

2017-11-30 Thread Valentin Kulichenko
Nikolay,

Java 7 support will be dropped by Ignite soon, so let's do the upgrade now.
I changed both 'Ignite RDD' and 'Ignite RDD spark 2_10' configuration on TC
to use JDK 8. Can you try it out and let me know if it works?

-Val

On Wed, Nov 29, 2017 at 11:28 PM, Николай Ижиков wrote:

> Valentin,
>
> > Oh, so this is because of upgrade to 2.2.0?
>
> Yes, we should upgrade the spark module to jdk1.8 because of switching to spark 2.2.0.
>
> > least we should consider not dropping previous version yet.
>
> Please note, we can have IgniteRDD and Ignite Data Frame for spark 2.1.
> The only thing we can't have is IgniteCatalog.
>
> Currently, in my PR I include IgniteRDD and Ignite Data Frame in the spark_2.10 module, so we don't have to drop spark 2.1 completely.
>
> https://github.com/apache/ignite/pull/2742
>
> 30.11.2017 02:09, Valentin Kulichenko wrote:
>
>> Nikolay,
>>
>> Oh, so this is because of upgrade to 2.2.0? Then I'm not sure we should upgrade in the first place, or at least we should consider not dropping the previous version yet. I sent a message to the original thread about the upgrade, let's decide there and then come back to this issue.
>>
>> -Val
>>
>> On Tue, Nov 28, 2017 at 8:08 PM, Николай Ижиков wrote:
>>
>>> Valentin,
>>>
>>> For now the `Ignite RDD` build runs on jdk1.7.
>>> We need to update it to jdk1.8.
>>>
>>> I wrote out the whole version numbers to be clear:
>>>
>>> 1. Current master - Spark version is 2.1.0.
>>>    So both `Ignite RDD` and `Ignite RDD 2.10` run OK on jdk1.7.
>>>
>>> 2. My branch -
>>>    `Ignite RDD 2.10` - spark version is 2.1.2 - runs OK on jdk1.7.
>>>    `Ignite RDD` - spark version 2.2.0 - fails on jdk1.7, *has to be changed to run on jdk1.8*
>>>
>>> 29.11.2017 03:27, Valentin Kulichenko wrote:
>>>
>>>> Nikolay,
>>>>
>>>> If Spark requires Java 8, then I guess we have no choice. How is TC configured at the moment? My understanding is that the Spark related suites are successfully executed there, so is there an issue?
>>>>
>>>> -Val
>>>>
>>>> On Tue, Nov 28, 2017 at 2:42 AM, Николай Ижиков wrote:
>>>>
>>>>> Hello, Valentin.
>>>>>
>>>>> > Added '-Dscala-2.10' to the build config. Let me know if it helps.
>>>>>
>>>>> Yes, it helps. Thank you!
>>>>> Now, 'Ignite RDD spark 2_10' succeeds for my branch.
>>>>>
>>>>> > Do you mean that IgniteRDD does not compile on JDK7? If yes, do we know the reason? I don't think switching it to JDK8 is a solution as it should work with both.
>>>>>
>>>>> I mean that the latest version of spark doesn't support jdk7.
>>>>>
>>>>> http://spark.apache.org/docs/latest/
>>>>>
>>>>> "Spark runs on Java 8+..."
>>>>> "For the Scala API, Spark 2.2.0 uses Scala 2.11..."
>>>>> "Note that support for Java 7... were removed as of Spark 2.2.0"
>>>>> "Note that support for Scala 2.10 is deprecated..."
>>>>>
>>>>> Moreover, we can't have IgniteCatalog for spark 2.1.
>>>>> Please see my explanation in the jira ticket -
>>>>>
>>>>> https://issues.apache.org/jira/browse/IGNITE-3084?focusedCommentId=16268523&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16268523
>>>>>
>>>>> Do you see any options to support jdk7 for the spark module?
>>>>>
>>>>> > I think all tests should be executed on TC. Can you check if they work and add them to corresponding suites
>>>>>
>>>>> OK, I filed a ticket and will try to fix it shortly.
>>>>>
>>>>> https://issues.apache.org/jira/browse/IGNITE-7042
>>>>>
>>>>> 28.11.2017 03:33, Valentin Kulichenko wrote:
>>>>>
>>>>>> Hi Nikolay,
>>>>>>
>>>>>> Please see my responses inline.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Fri, Nov 24, 2017 at 2:55 AM, Николай Ижиков 

Internal problems requiring graceful node shutdown, reboot, etc.

2017-11-30 Thread Дмитрий Сорокин
Hi, Igniters!

We have a set of internal problems which require a graceful node shutdown
or another configured reaction (see discussion thread
http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Enhancement-Proposal-7-Internal-problems-detection-td24460.html
):
- IgniteOutOfMemoryException -
https://issues.apache.org/jira/browse/IGNITE-6892
- Persistence errors - https://issues.apache.org/jira/browse/IGNITE-6891
- ExchangeWorker exits with error -
https://issues.apache.org/jira/browse/IGNITE-6890

First, I propose to reconsider the 3rd problem as "System worker exits while the node
is still running (the node stopping process has not been started)", because we
have at least 5 worker classes whose running is critical for the node's operation.

These workers are:
- partition-exchanger (ExchangeWorker)
- disco-event-worker
- nio-acceptor
- grid-nio-worker-tcp-comm-*
- grid-timeout-worker

Second, I propose to use FailureProcessingPolicy (already implemented in the
scope of task IGNITE-6890) for defining the reaction to the 1st and 2nd detected
problems too. This policy can be configured similarly to SegmentationPolicy
in IgniteConfiguration.

Opinions?
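
For comparison, a minimal sketch of the existing SegmentationPolicy knob that this proposal mirrors; the FailureProcessingPolicy setter shown in the comment is hypothetical at this point:

{code}
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class SegmentationConfig {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Existing API: what to do when the node gets segmented.
        cfg.setSegmentationPolicy(SegmentationPolicy.STOP);

        // The proposal is an analogous knob for internal failures, e.g.
        // (hypothetical): cfg.setFailureProcessingPolicy(FailureProcessingPolicy.STOP);

        Ignition.start(cfg);
    }
}
{code}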


RE: Adding Persistent Memory Support for Ignite

2017-11-30 Thread Mammo, Mulugeta
Hi Dmitriy,

We're still working on the documentation of LLPL. You may forward me your 
questions in the meantime and I'll try to answer them.

Thanks,
Mulugeta

-Original Message-
From: Dmitriy Setrakyan [mailto:dsetrak...@apache.org] 
Sent: Thursday, November 16, 2017 1:26 PM
To: dev@ignite.apache.org
Subject: Re: Adding Persistent Memory Support for Ignite

Hi Mulugeta,

Where can I find documentation about LLPL to understand how memory and 
persistence are handled there?

D.

On Thu, Nov 16, 2017 at 7:16 AM, Mammo, Mulugeta 
wrote:

> Hi all,
>
>
>
> Ignite, when persistence mode is enabled, stores data and indexes on disk.
> To minimize the latency of disks, several tuning options can be applied.
> Setting the page size of a memory region to match the page size of the 
> underlying storage, using a separate disk for the WAL, and using 
> production-level SSDs are just a few of them [ 
> https://apacheignite.readme.io/docs/durable-memory-tuning#
> section-native-persistence-related-tuning ].
>
> A persistent memory store with low latency and high capacity offers a 
> viable alternative to disks. In light of this, we are proposing to 
> make use of our Low Level Persistent Library (LLPL), 
> https://github.com/pmem/pcj/ tree/master/LLPL, to offer a persistent memory 
> storage for Ignite.
>
> At this point, we envision two distinct implementation options:
>
> 1.  Data and indexes will continue to be stored in the off-heap memory
> but the disk will be replaced by a persistent memory. Since 
> persistence memory in this option is not a file system, the logic 
> currently offered by WAL file and the partition files would have to be 
> implemented from scratch.
>
> 2.  In this option, we eliminate the current check-point process and
> the WAL file. We will use a memory region defined by LLPL to store 
> data and indexes. There will be no off-heap memory. DRAM will be 
> exclusively used to store hot cache entries just like the on-heap 
> cache is in the current implementation.
>
> In both cases, there are more details and subtleties that have to be handled
> - e.g. the atomic and transactional guarantees offered. More 
> clarifications will be given as we go along. And, feel free to provide your 
> thoughts.
>
> Thanks,
> Mulugeta
>
>


[jira] [Created] (IGNITE-7082) .NET: Cross-platform standalone executable

2017-11-30 Thread Pavel Tupitsyn (JIRA)
Pavel Tupitsyn created IGNITE-7082:
--

 Summary: .NET: Cross-platform standalone executable
 Key: IGNITE-7082
 URL: https://issues.apache.org/jira/browse/IGNITE-7082
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Affects Versions: 2.4
Reporter: Pavel Tupitsyn


Apache.Ignite.Core.dll works on all platforms (IGNITE-2662), but it is just a 
library to be referenced by user code.

We also provide Apache.Ignite.exe, which is Windows-only; .NET Core cannot run 
exe files.

Find a way to provide cross-platform standalone executable to be run on .NET 
Core. That could be a prebuilt .NET Core dll or a project to be started with 
{{dotnet run}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-7081) Increase in partition count silently breaks persistent cache reads and writes

2017-11-30 Thread Joel Lang (JIRA)
Joel Lang created IGNITE-7081:
-

 Summary: Increase in partition count silently breaks persistent 
cache reads and writes
 Key: IGNITE-7081
 URL: https://issues.apache.org/jira/browse/IGNITE-7081
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.3
Reporter: Joel Lang
Priority: Minor


An increase in the partition count for a cache, made to even out distribution between 
nodes, led to bad, inconsistent behavior in the cache.

Gets on known keys would return null, due to the partition number being 
different, even as SQL queries would still find the cache entry through their own 
means.

Removals of these cache entries using SQL would also fail.

It took several hours to track down the issue because of the inconsistency of 
the behavior between SQL queries and a call to get().

Changing the partition count would not have been an issue before we used native 
persistence but now it is.

I believe the solution is to have a more rigid verification of the stored cache 
configuration against the live cache configuration when the cache starts. It 
should fail if any configuration changes are made that would cause problems. It 
also makes me wonder what other changes are safe or not to make to a cache 
configuration that is persistent.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-7080) YARN fails to create containers if Bash functions exported in environment

2017-11-30 Thread Ilya Kasnacheev (JIRA)
Ilya Kasnacheev created IGNITE-7080:
---

 Summary: YARN fails to create containers if Bash functions 
exported in environment
 Key: IGNITE-7080
 URL: https://issues.apache.org/jira/browse/IGNITE-7080
 Project: Ignite
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.3
Reporter: Ilya Kasnacheev
Assignee: Ilya Kasnacheev


Ignite YARN collects all existing environment variables to pass them to 
container, including variables with incorrect names, such as Bash functions, 
which have extra characters at the end, and are ignored by most shells but not 
the JVM.

When you tell Bash to export functions, it puts a BASH_FUNC_your_function_name%% 
variable into the env. This is what causes the problems, because Ignite YARN picks 
this variable up and tells Hadoop to pass it to the container, which leads to 
incorrectly written startup scripts.

Hadoop tries to sanitize env var values but not env var names. I think Ignite 
should not try to pass all env vars to containers (it may contain sensitive or 
master-specific vars!). We should only pass env vars that are relevant to 
containers, such as IGNITE_* vars.

See 
http://apache-ignite-users.70518.x6.nabble.com/Error-running-ignite-in-YARN-td18280.html
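
A sketch of the proposed fix: pass only the relevant variables through (pure JDK; names illustrative):

{code}
import java.util.HashMap;
import java.util.Map;

public class ContainerEnvFilter {
    /** Keep only the variables containers actually need, e.g. IGNITE_*. */
    public static Map<String, String> containerEnv() {
        Map<String, String> env = new HashMap<>();

        for (Map.Entry<String, String> e : System.getenv().entrySet()) {
            // Drops Bash function exports like "BASH_FUNC_foo%%" along with
            // anything master-specific or sensitive.
            if (e.getKey().startsWith("IGNITE_"))
                env.put(e.getKey(), e.getValue());
        }

        return env;
    }
}
{code}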



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] ignite pull request #3035: IGNITE-6867 Implement new JMX metrics for topolog...

2017-11-30 Thread xtern
Github user xtern closed the pull request at:

https://github.com/apache/ignite/pull/3035


---


[GitHub] ignite pull request #3118: ignite-7049

2017-11-30 Thread ascherbakoff
GitHub user ascherbakoff opened a pull request:

https://github.com/apache/ignite/pull/3118

ignite-7049

Optimistic transaction is not properly rolled back if timed out before 
sending prepare response.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-7049

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3118


commit 5f5a4e7e7cd1e630be7e6934748b02c72956923a
Author: Aleksei Scherbakov 
Date:   2017-11-29T17:23:49Z

IGNITE-7049 wip.

commit 077c930c90961e9cb5b4ce4182f0292841f5d206
Author: Aleksei Scherbakov 
Date:   2017-11-30T13:10:49Z

IGNITE-7049 wip.

commit cce3fb2b65eac6ec2587c970d078cae6889d308a
Author: Aleksei Scherbakov 
Date:   2017-11-30T13:12:25Z

IGNITE-7049 wip.




---


[GitHub] ignite pull request #3106: IGNITE-7070: Ignite PDS compatibilty framework fi...

2017-11-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3106


---


[GitHub] ignite pull request #3098: IGNITE-7071: Batch cache destroy requests added

2017-11-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3098


---


[GitHub] ignite pull request #3067: IGNITE-6955 Update com.google.code.simple-spring-...

2017-11-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3067


---


[GitHub] ignite pull request #2980: IGNITE-6828 Confusing messages SLF4J: Failed to l...

2017-11-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/2980


---


[GitHub] ignite pull request #3117: IGNITE-6880: KNN Regression and Classification

2017-11-30 Thread zaleslaw
GitHub user zaleslaw opened a pull request:

https://github.com/apache/ignite/pull/3117

IGNITE-6880: KNN Regression and Classification

Also added Labeled Dataset and loading from txt file

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-6880

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3117.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3117


commit 2cb4f5f3da67ad077f837eeeda11baf44c101245
Author: zaleslaw 
Date:   2017-11-14T12:47:29Z

Refactored keys

commit 462719fef2364cd2aded23bb32edf4d0a53b9d7f
Author: zaleslaw 
Date:   2017-11-14T12:47:51Z

Added tests

commit c180c8c60a1b43a516cd7a045420bc1dc16adcdd
Author: zaleslaw 
Date:   2017-11-14T12:48:14Z

Added vectors

commit 39766ed0bfdb9a95118b7e0c59ec39a441307c0e
Author: zaleslaw 
Date:   2017-11-14T12:48:43Z

Refactored storages and matrices

commit b9deec6d3f71db8da45800ad8a6f0ec560a25294
Author: zaleslaw 
Date:   2017-11-14T12:49:11Z

Fixed issues with keys

commit 3dbbb20efccbc85bf8022b34935096d5310c9554
Author: zaleslaw 
Date:   2017-11-14T18:25:14Z

Added kNN classifier draft

commit 41ee88add4ebc31efa7a4e903ff4d78d92bd2e8a
Author: zaleslaw 
Date:   2017-11-15T16:59:44Z

Added loading from file

commit 68c82c7f3ac3c220f52f76ce0057032bc55b33fe
Author: zaleslaw 
Date:   2017-11-15T18:56:48Z

Added kNN regression draft

commit 552d9af6d9ec0ca5156a7b84457c93dc896946e3
Author: zaleslaw 
Date:   2017-11-16T11:28:41Z

Added different distances

commit a003cc22902afa2fdfcb1d90a29d07ff936a13be
Author: Zinoviev Alexey 
Date:   2017-11-21T16:32:25Z

Fix codestyle

commit 29014354df79f2ff6e46542420030c0f091566fc
Author: Zinoviev Alexey 
Date:   2017-11-26T14:01:39Z

Merge branch 'master' into ignite-6880

commit 12f081969baa25c49b0c55f74e3551c113ebb8d8
Author: Zinoviev Alexey 
Date:   2017-11-26T15:17:25Z

Added distances and test for them

commit 8a1cc76ed63c4d72a76af37423d8a197eafd6f31
Author: Zinoviev Alexey 
Date:   2017-11-27T15:44:41Z

Added LabeledDataset

commit 754208280ee929df8c317eb15c0a3071a63dc6f5
Author: Zinoviev Alexey 
Date:   2017-11-28T15:34:01Z

Added tests

commit c5c37304929d67aa9722d701d9b40b976198b812
Author: Zinoviev Alexey 
Date:   2017-11-28T15:52:53Z

Added tests

commit d282042fb4625159f65629fe1a9efa0e25831195
Author: Zinoviev Alexey 
Date:   2017-11-28T16:29:27Z

Added normalization

commit 52749bc9bcfe2d9ad50b92b0f392e2baa4a505f1
Author: Zinoviev Alexey 
Date:   2017-11-29T16:04:05Z

Added code style

commit ed78a89cf59d07de2781c755749ea97e041f8839
Author: Zinoviev Alexey 
Date:   2017-11-30T11:30:07Z

Added code style

commit 663a8e1a1ea875cfa8e7b31652a7aa7ec1acd668
Author: Zinoviev Alexey 
Date:   2017-11-30T11:36:08Z

Added code style




---


[jira] [Created] (IGNITE-7079) Add examples for kNN classification and for kNN regression

2017-11-30 Thread Aleksey Zinoviev (JIRA)
Aleksey Zinoviev created IGNITE-7079:


 Summary: Add examples for kNN classification and for kNN regression
 Key: IGNITE-7079
 URL: https://issues.apache.org/jira/browse/IGNITE-7079
 Project: Ignite
  Issue Type: Task
  Components: ml
Reporter: Aleksey Zinoviev
Assignee: Aleksey Zinoviev


Should contain 4 examples for weighted/simple versions for both algorithms

Also it should contain Normalization usage



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] ignite pull request #3105: ignite-gg-13099

2017-11-30 Thread sk0x50
Github user sk0x50 closed the pull request at:

https://github.com/apache/ignite/pull/3105


---


[jira] [Created] (IGNITE-7078) *Names() API for datastructures

2017-11-30 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7078:


 Summary: *Names() API for datastructures
 Key: IGNITE-7078
 URL: https://issues.apache.org/jira/browse/IGNITE-7078
 Project: Ignite
  Issue Type: Wish
  Components: general
Affects Versions: 2.3, 2.2, 2.1, 2.0, 1.9, 1.8, 1.7, 1.6
Reporter: Alexander Belyak


In the public API we have the ignite.cacheNames() method to get all cache names and 
ignite.services().serviceDescriptors() to get services, but no methods for data 
structures (see the sketch after this list), like:
* atomicSequenceNames()
* atomicLongNames()
* atomicReferenceNames()
* atomicStampedNames()
* queueNames()
* setNames()
* semaphoreNames()
* countDownLatchNames()
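
A short sketch contrasting the calls that exist today with the proposed (hypothetical) ones:

{code}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.services.ServiceDescriptor;

public class ListNames {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        // What exists today:
        for (String cacheName : ignite.cacheNames())
            System.out.println("cache: " + cacheName);

        for (ServiceDescriptor desc : ignite.services().serviceDescriptors())
            System.out.println("service: " + desc.name());

        // What the ticket asks for (hypothetical, not in the API today):
        // for (String name : ignite.atomicLongNames()) { ... }
        // for (String name : ignite.queueNames()) { ... }
    }
}
{code}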




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Enable/disable cache statistics in runtime

2017-11-30 Thread Anton Vinogradov
Seems,

We have to use a custom discovery message to be aware of new nodes that are
joining the grid.

On Thu, Nov 30, 2017 at 5:08 AM, Alexey Kuznetsov wrote:

> Alex,
>
> We have such an issue in JIRA: https://issues.apache.org/jira/browse/IGNITE-369
> I think you can update its description after we agreed on implementation
> details.
>
> --
> Alexey Kuznetsov
>


Re: Transport compression (not store compression)

2017-11-30 Thread Vladimir Ozerov
I would start with communication only, and put discovery and clients aside
for now. Let's confirm that communication works well with compression first
in terms of performance.

On Thu, Nov 30, 2017 at 10:36 AM, Nikita Amelchev 
wrote:

> Hello, everybody.
> I propose the following design for network compression.
>
> I suggest implementing it like the SSL implementation that already works: I add a
> compression filter to the chain of NIO filters. It changes the logic of the
> TcpCommunicationSpi.safeTcpHandshake() and
> DirectNioClientWorker.processWrite() methods, where, depending on whether
> SSL and/or compression are turned on, we compress and/or encrypt data and
> messages using the compress handler. Also, I will use an application buffer
> (implemented in "GridNioCompressHandler") for decompressing.
>
> For compression verification, I have to write a header with the length of the
> compressed data (or get it from the algorithm's headers) and a flag for
> compression disabled (for small messages). When we are reading compressed
> data from a channel, we check the length, the same way as the isInboundDone flag
> is checked in the SSL implementation. It is convenient to implement this flag
> in "CompressEngine", with wrap and unwrap methods to compress and
> decompress to a byte buffer. Compression settings should be placed in the
> configuration (Ignite configuration/discovery SPI and
> GridClientConfiguration).
>
> Any thoughts?
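
A minimal sketch of the wrap-side framing described above (a length header plus a compression-disabled flag for small messages), using java.util.zip.Deflater; the class and constant names are illustrative, not the actual CompressEngine:

{code}
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.Deflater;

public class CompressFraming {
    private static final int SMALL_MSG = 128; // below this, skip compression

    /** Frame layout: [int length][byte compressed?][payload]. */
    public static ByteBuffer wrap(byte[] msg) {
        byte[] payload = msg;
        byte compressed = 0;

        if (msg.length >= SMALL_MSG) {
            Deflater deflater = new Deflater();
            deflater.setInput(msg);
            deflater.finish();

            byte[] out = new byte[msg.length]; // worst case: keep the original
            int len = deflater.deflate(out);
            deflater.end();

            if (len > 0 && len < msg.length) { // keep only if it actually shrank
                payload = Arrays.copyOf(out, len);
                compressed = 1;
            }
        }

        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + payload.length);
        buf.putInt(payload.length).put(compressed).put(payload);
        buf.flip();

        return buf;
    }
}
{code}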
>
>
> 2017-11-28 10:39 GMT+03:00 Nikita Amelchev :
>
> > Hi,
> > I've filed a ticket [1]. I'll try to share design details in a couple of
> > days.
> >
> > 1. https://issues.apache.org/jira/browse/IGNITE-7024
> >
> > 2017-11-23 18:31 GMT+03:00 Denis Magda :
> >
> >> Nikita,
> >>
> >> Sounds like a good plan. Please share the design details prior getting
> >> down to the implementation.
> >>
> >> —
> >> Denis
> >>
> >> > On Nov 23, 2017, at 4:38 AM, Nikita Amelchev 
> >> wrote:
> >> >
> >> > Hi Igniters!
> >> >
> >> > I’m working on the similar feature for my own project.
> >> > I would like to suggest use in-line compression and write encoded
> bytes
> >> in
> >> > network channel by bytes array buffer. It allows us avoiding expensive
> >> > memory allocation.
> >> > The described design may be implemented in TcpCommunicationSpi level.
> We
> >> > can introduce pluggable compressor on TCP level where we will be able
> to
> >> > describe our compression strategy, for example, exclude some small
> >> messages
> >> > and many other.
> >> >
> >> > If the community doesn't mind I will file the ticket and will start
> >> > implementing it.
> >> > Any thoughts?
> >> >
> >> > 2017-11-23 12:06 GMT+03:00 Vladimir Ozerov :
> >> >
> >> >> Denis,
> >> >>
> >> >> Regarding zipped marshaller - this would be inefficient, because
> >> >> compression rate will be lower.
> >> >>
> >> >> On Thu, Nov 23, 2017 at 1:01 AM, Denis Magda wrote:
> >> >>
> >> >>> Nikita,
> >> >>>
> >> >>> Your solution sounds reasonable from the first glance. However, the
> >> >>> communication layer processes a dozen of small system messages that
> >> >> should
> >> >>> be excluded from the compression. Guess, that we will spend more
> time
> >> on
> >> >>> compressing/decompressing them thus diminishing the positive effect
> of
> >> >> the
> >> >>> compression.
> >> >>>
> >> >>> Alexey K., Vladimir O.,
> >> >>>
> >> >>> What if we create Zip version of the binary marshaller the same way
> we
> >> >>> implemented GridClientZipOptimizedMarshaller?
> >> >>>
> >> >>> —
> >> >>> Denis
> >> >>>
> >>  On Nov 22, 2017, at 5:36 AM, Alexey Kuznetsov <akuznet...@apache.org> wrote:
> >> 
> >>  I think it is very useful feature.
> >>  I also have experience when server nodes connected via fast
> network.
> >>  But client nodes via very slow network.
> >> 
> >>  I implemeted GridClientZipOptimizedMarshaller and that solved my
> >> >> issue.
> >>  But this marshaller works only with old
> >>  and org.apache.ignite.internal.client.GridClient and has a lot of
> >>  limitations.
> >>  But compression was about 6-20x times.
> >> 
> >>  We need a solution for Ignite 2.x and client nodes.
> >> 
> >> 
> >>  On Wed, Nov 22, 2017 at 7:48 PM, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> >> 
> >> > Hello, Igniters!
> >> >
> >> > I think it is a useful feature. I suggest to implement it to
> >> >>> communication
> >> > SPI like SSL encryption implemented. I have experience with this
> >> >> feature
> >> > and I can try to develop it.
> >> >
> >> > 2017-11-22 12:01 GMT+03:00 Alexey Kukushkin <kukushkinale...@gmail.com>:
> >> >
> >> >> Forwarding to DEV list: Ignite developers, could you please share
> >> >> your
> >> >> thoughts on how hard it is to extend Ignite to compress data on
> 

Replicated with CacheStore and Read-Through works like Partitioned with no backups.

2017-11-30 Thread Andrey Mashenkov
Hi Igniters,

Ignite Replicated cache with CacheStore and Read-through=true and
ReadFromBackup=true works in an unexpected way from the user's perspective.

Imagine a case where a user has a large dataset in the underlying db and wants to
cache hot data only, so there is no need to do cache.loadCache().

The user's expectation is that when he gets an entry from Ignite, Ignite goes to the
CacheStore and saves the entry locally (as readthrough=true), and a
subsequent get() of the same entry will return the result from the local map.

For now this is true if the user calls get() on the primary node.
But it works in a different and unexpected way on a backup, where every get
operation will go to the primary node regardless of ReadFromBackup=true and
Read-through=true.

So, a Replicated cache in that case works like Partitioned with no backups.

Of course, I understand it would be even weirder behavior if a get() operation
caused a cluster-wide operation to update backups when read-through
happened.
But why doesn't get() from a backup update the local cache map once the entry
is requested from the primary?

Can we fix this or have some additional mode to work around this issue?
Thoughts?

-- 
Best regards,
Andrey V. Mashenkov
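
For reference, a minimal sketch of the setup being discussed; DbStore below is a placeholder for the user's CacheStore implementation:

{code}
import javax.cache.Cache;
import javax.cache.configuration.FactoryBuilder;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.configuration.CacheConfiguration;

public class ReplicatedReadThrough {
    /** Placeholder store: pretends every key exists in the underlying db. */
    public static class DbStore extends CacheStoreAdapter<Long, String> {
        @Override public String load(Long key) { return "row-" + key; }
        @Override public void write(Cache.Entry<? extends Long, ? extends String> e) { /* no-op */ }
        @Override public void delete(Object key) { /* no-op */ }
    }

    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        CacheConfiguration<Long, String> ccfg = new CacheConfiguration<>("hotData");
        ccfg.setCacheMode(CacheMode.REPLICATED);
        ccfg.setReadThrough(true);    // pull misses from the store
        ccfg.setReadFromBackup(true); // gets on backups are expected to stay local
        ccfg.setCacheStoreFactory(FactoryBuilder.factoryOf(DbStore.class));

        IgniteCache<Long, String> cache = ignite.getOrCreateCache(ccfg);

        // On a node that is NOT primary for key 42, this get() goes to the
        // primary and, per the report, so does every subsequent get().
        System.out.println(cache.get(42L));
    }
}
{code}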


[GitHub] ignite pull request #2929: ignite-2.1.6-b4 testing

2017-11-30 Thread dmekhanikov
Github user dmekhanikov closed the pull request at:

https://github.com/apache/ignite/pull/2929


---


[GitHub] ignite pull request #3116: ignite-6369 fix Scala version for ignite-spark de...

2017-11-30 Thread dmekhanikov
GitHub user dmekhanikov opened a pull request:

https://github.com/apache/ignite/pull/3116

ignite-6369 fix Scala version for ignite-spark dependencies



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-6369

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3116


commit 953f803eac4740296c57ccb6dd852e264b22c53d
Author: Denis Mekhanikov 
Date:   2017-11-08T13:07:21Z

ignite-6369 fix scala version for ignite-spark dependencies




---


Re: Thin Client Protocol documentation

2017-11-30 Thread Pavel Tupitsyn
Hi Alexey,

Points 1, 2 and 3 relate only to the handshake. All other operations are consistent.

The handshake request format is dictated by the existing client connector that
is shared with the ODBC and JDBC clients (see
ClientListenerNioListener.onHandshake), so we can't add magic numbers or
change the operation code.

But yes, we can add the server version to the handshake response, and I think
this makes sense.

> 4. The same comments for success flag (1 byte) and status code (4 bytes)
in responses. Let's leave only status code.
We don't have a success flag in responses, there is just a 4-byte status
code, 0 indicates success, everything else is an error.

Thanks,
Pavel

On Thu, Nov 30, 2017 at 12:01 PM, Alexey Popov  wrote:

> Hi Pavel,
>
> Let me add my 5 cents.
>
> 1. It would be great if both the Handshake request & response had some
> "magic" number (2 or 4 bytes) inside their message body. That would simplify
> handling situations where a non-Ignite client connects to an Ignite server
> and vice versa.
>
> 2. It makes sense to add the server version to a successful Handshake response
> as well. It will help to understand & debug possible backward-compatibility
> issues in the field via *.pcap log analysis, etc.
>
> 3. Can we have a more strict header for all message types?
> As far as I understand,
> Handshake request has:
> 1) length - 4 bytes
> 2) Handshake code - 1 byte
> 3) body - (length - 1) bytes
>
> while OP_CACHE_GET request has:
> 1) length - 4 bytes
> 2) OP_CACHE_GET code - 2 bytes
> 3) request id - 4 bytes
> 4) body - (length - 2 - 4) bytes
>
> Why do some messages have a 1-byte operation code while others have 2 bytes?
> Why do some requests/responses have a request id while others don't? Let's
> simplify the parser's work )
>
> 4. The same comments for success flag (1 byte) and status code (4 bytes)
> in responses. Let's leave only status code.
>
> Thank you,
> Alexey
>
> From: Pavel Tupitsyn
> Sent: Wednesday, November 22, 2017 4:04 PM
> To: dev@ignite.apache.org
> Subject: Thin Client Protocol documentation
>
> Igniters,
>
> I've put together a detailed description of our Thin Client protocol
> in the form of an IEP on the wiki:
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-9+Thin+Client+Protocol
>
>
> To clarify:
> - Protocol implementation is in master (see ClientMessageParser class)
> - Protocol has not been released yet, so we are free to change anything
> - Protocol is only used by .NET Thin Client for now, but is supposed to be
> used from other languages by third party contributors
> - More operations will be added in the future; this is the first,
> cache-related set of them
>
>
> Please review the document and let me know your thoughts.
> Is there anything missing or wrong?
>
> We should make sure that the foundation is solid and extensible.
>
>
> Thanks,
> Pavel
>
>
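
To make the response format concrete, here is a minimal sketch of reading a response header with the field sizes discussed in this thread (4-byte length prefix, a 4-byte request id as listed for requests, and the 4-byte status code where 0 is success). Little-endian byte order and the helper class itself are assumptions of this sketch, not a documented API.

{code}
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ResponseHeaderSketch {
    /** Reads one response frame and returns the operation-specific body. */
    public static ByteBuffer readResponse(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);

        // 4-byte length prefix (does not count itself).
        byte[] lenBytes = new byte[4];
        din.readFully(lenBytes);
        int len = ByteBuffer.wrap(lenBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();

        byte[] msg = new byte[len];
        din.readFully(msg);
        ByteBuffer buf = ByteBuffer.wrap(msg).order(ByteOrder.LITTLE_ENDIAN);

        int reqId = buf.getInt();  // request id echoed back by the server
        int status = buf.getInt(); // 0 indicates success, everything else is an error

        if (status != 0)
            throw new IOException("Request " + reqId + " failed with status " + status);

        return buf; // remaining bytes are the operation-specific body
    }
}
{code}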


[GitHub] ignite pull request #3115: IGNITE-6423: PDS could be corrupted if partition ...

2017-11-30 Thread AMashenkov
GitHub user AMashenkov opened a pull request:

https://github.com/apache/ignite/pull/3115

IGNITE-6423: PDS could be corrupted if partition have been evicted and 
owned again



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-6423-master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3115.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3115


commit 7cdcd2aa884aa4d0645bfb16004b3443b849d476
Author: Ilya Lantukh 
Date:   2017-09-18T14:36:13Z

ignite-6423 : Fixed outdated page handling.

commit a0b167c21acc1ddff228da9477c2c4e6043703fc
Author: Eduard Shangareev 
Date:   2017-11-29T08:36:22Z

IGNITE-6423 PDS could be corrupted if partition have been evicted and 
returned to node.




---


RE: Thin Client Protocol documentation

2017-11-30 Thread Alexey Popov
Hi Pavel,

Let me add my 5 cents. 

1. It would be great if both the Handshake request & response had some "magic"
number (2 or 4 bytes) inside their message body. That would simplify handling
situations where a non-Ignite client connects to an Ignite server and vice versa.

2. It makes sense to add the server version to a successful Handshake response as
well. It will help to understand & debug possible backward-compatibility issues
in the field via *.pcap log analysis, etc.

3. Can we have a more strict header for all message types? 
As far as I understand, 
Handshake request has: 
1) length - 4 bytes
2) Handshake code - 1 byte 
3) body - (length - 1) bytes 

while OP_CACHE_GET request has: 
1) length - 4 bytes
2) OP_CACHE_GET code - 2 bytes 
3) request id - 4 bytes 
4) body - (length - 2 - 4) bytes 

Why do some messages have a 1-byte operation code while others have 2 bytes? Why
do some requests/responses have a request id while others don't? Let's simplify
the parser's work )

4. The same comments for success flag (1 byte) and status code (4 bytes) in 
responses. Let's leave only status code. 

Thank you,
Alexey
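
For illustration, the two request layouts listed above, written out as a byte-level sketch (little-endian assumed; the method arguments are caller-supplied placeholders, not real protocol constants):

{code}
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RequestFrameSketch {
    /** Handshake: 4-byte length, 1-byte handshake code, body of (length - 1) bytes. */
    static byte[] handshakeFrame(byte handshakeCode, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + body.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(1 + body.length); // length prefix does not count itself
        buf.put(handshakeCode);
        buf.put(body);
        return buf.array();
    }

    /** Regular op (e.g. OP_CACHE_GET): 4-byte length, 2-byte op code, 4-byte request id, body. */
    static byte[] requestFrame(short opCode, int requestId, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 + 4 + body.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(2 + 4 + body.length);
        buf.putShort(opCode);
        buf.putInt(requestId);
        buf.put(body);
        return buf.array();
    }
}
{code}

Seen this way, the inconsistency is just the missing request id and the narrower code in the handshake frame.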

From: Pavel Tupitsyn
Sent: Wednesday, November 22, 2017 4:04 PM
To: dev@ignite.apache.org
Subject: Thin Client Protocol documentation

Igniters,

I've put together a detailed description of our Thin Client protocol
in the form of an IEP on the wiki:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-9+Thin+Client+Protocol


To clarify:
- Protocol implementation is in master (see ClientMessageParser class)
- Protocol has not been released yet, so we are free to change anything
- Protocol is only used by .NET Thin Client for now, but is supposed to be
used from other languages by third party contributors
- More operations will be added in the future; this is the first,
cache-related set of them


Please review the document and let me know your thoughts.
Is there anything missing or wrong?

We should make sure that the foundation is solid and extensible.


Thanks,
Pavel



IGNITE-6612 is ready for review (Wrap ack methods in their own class)

2017-11-30 Thread Иван Федотов
Hi, Igniters!
I've prepared PR [1] for the issue IGNITE-6612 "Wrap ack methods in their
own class" [2] . TeamCity tests look good [3]. Could someone review it?
Thanks in advance!
[1]https://github.com/apache/ignite/pull/3046
[2]https://issues.apache.org/jira/browse/IGNITE-6612
[3]
https://ci.ignite.apache.org/project.html?projectId=Ignite20Tests_Ignite20Tests=pull%2F3046%2Fhead
-- 
Ivan Fedotov.

ivanan...@gmail.com


Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-30 Thread Николай Ижиков

Valentin,

> Can you please create a separate ticket for the strategy implementation then?

Done.

https://issues.apache.org/jira/browse/IGNITE-7077

> Any idea how long it will take?

I think it will take 2-4 weeks to implement such a strategy.
I will try my best to have a ready-to-review PR before the end of the year.
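
As a side note, here is a minimal sketch of the "key filter" case on top of the setPartitions() method Valentin mentions below; the cache, SQL table and key type are illustrative.

{code}
import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class PartitionPruningSketch {
    /** Queries only the partition that owns the given key. */
    static List<List<?>> getByKey(Ignite ignite, IgniteCache<Integer, ?> cache, int key) {
        // Map the key to its partition via the cache's affinity function.
        int part = ignite.affinity(cache.getName()).partition(key);

        SqlFieldsQuery qry = new SqlFieldsQuery(
                "select _key, _val from Person where _key = ?")
            .setArgs(key)
            .setPartitions(part); // scan only the owning partition

        return cache.query(qry).getAll();
    }
}
{code}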


On 30.11.2017 02:13, Valentin Kulichenko wrote:

Nikolay,

Can you please create a separate ticket for the strategy implementation
then? Any idea how long it will take?

As for querying a partition, both SqlQuery and SqlFieldsQuery allow specifying
a set of partitions to work with (see the setPartitions method). I think
that should be enough.

-Val

On Wed, Nov 29, 2017 at 3:39 AM, Vladimir Ozerov wrote:


Hi Nikolay,

No, it is not possible to get this info from the public API, nor do we plan
to expose it. See IGNITE-4509 and commit *fbf0e353* to get a better
understanding of how this was implemented.

Vladimir.

On Wed, Nov 29, 2017 at 2:01 PM, Николай Ижиков wrote:



Hello, Vladimir.

> partition pruning is already implemented in Ignite, so there is no need
> to do this on your own.

Spark works with partitioned data sets.
It is required to provide data partition information to Spark from a custom
Data Source (Ignite).

Can I get information about pruned partitions through some public API?
Is there a plan or a ticket to implement such an API?



2017-11-29 10:34 GMT+03:00 Vladimir Ozerov :


Nikolay,

Regarding p3. - partition pruning is already implemented in Ignite, so
there is no need to do this on your own.

On Wed, Nov 29, 2017 at 3:23 AM, Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:


Nikolay,

Custom strategy allows to fully process the AST generated by Spark and
convert it to Ignite SQL, so there will be no execution on the Spark side at
all. This is what we are trying to achieve here. Basically, one will be able
to use the DataFrame API to execute queries directly on Ignite. Does it make
sense to you?

I would recommend you take a look at the MemSQL implementation, which does
similar stuff: https://github.com/memsql/memsql-spark-connector

Note that this approach will work only if all relations included in the AST
are Ignite tables. Otherwise, the strategy should return null so that Spark
falls back to its regular mode. Ignite will be used as a regular data source
in this case, and probably it's possible to implement some optimizations here
as well. However, I never investigated this and it seems like another
separate discussion.

-Val

On Tue, Nov 28, 2017 at 9:54 AM, Николай Ижиков <nizhikov@gmail.com> wrote:


Hello, guys.

I have implemented basic support of the Spark Data Frame API [1], [2] for
Ignite.
Spark provides an API for a custom strategy to optimize queries from Spark
to the underlying data source (Ignite).

The goal of optimization (obvious, just to be on the same page):
Minimize data transfer between Spark and Ignite.
Speed up query execution.

I see 3 ways to optimize queries:

 1. *Join Reduce* If one makes some query that joins two or more Ignite
tables, we have to pass all the join info to Ignite and transfer to Spark
only the result of the table join.
 To implement it we have to extend the current implementation with a new
RelationProvider that can generate all kinds of joins for two or more
tables.
 We should add some tests, also.
 The question is - how should the join result be partitioned?

 2. *Order by* If one makes some query to an Ignite table with an order by
clause, we can execute the sorting on the Ignite side.
 But it seems that currently Spark doesn't have any way to be told that
partitions are already sorted.

 3. *Key filter* If one makes a query with `WHERE key = XXX` or `WHERE key
IN (X, Y, Z)`, we can reduce the number of partitions and query only the
partitions that store the certain key values.
 Is this kind of optimization already built into Ignite, or should I
implement it by myself?

Maybe there is another way to make queries run faster?

[1] https://spark.apache.org/docs/latest/sql-programming-guide.html
[2] https://github.com/apache/ignite/pull/2742

--
Nikolay Izhikov
nizhikov@gmail.com







[jira] [Created] (IGNITE-7077) Spark Data Frame Support. Convert complete query to Ignite SQL

2017-11-30 Thread Nikolay Izhikov (JIRA)
Nikolay Izhikov created IGNITE-7077:
---

 Summary: Spark Data Frame Support. Convert complete query to 
Ignite SQL
 Key: IGNITE-7077
 URL: https://issues.apache.org/jira/browse/IGNITE-7077
 Project: Ignite
  Issue Type: Task
  Components: spark
Affects Versions: 2.3
Reporter: Nikolay Izhikov
Assignee: Nikolay Izhikov
 Fix For: 2.4


Basic support of Spark Data Frame for Ignite implemented in IGNITE-3084.

We need to implement a custom Spark strategy that can convert a whole Spark SQL
query to an Ignite SQL query if the query consists of only Ignite tables.
The strategy does nothing if the Spark query includes not only Ignite tables.

The MemSQL implementation can be taken as an example:
https://github.com/memsql/memsql-spark-connector





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)