I’m totally for the replacement of ‘crashed’ with ‘stopped’. 

As for waiting for the checkpoint to complete, I would NOT make it the 
default behavior and would rather check the ‘cancel’ flag to make the decision. 
If ‘cancel’ is ‘true’ (which is the default), then we’re not going to wait for 
the completion and should print a message that ‘cancel’ has to be set to 
‘false’ explicitly if the user prefers to wait until the checkpoint is over 
before shutting down a node.
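
To make the proposal concrete, here is a minimal sketch of that decision. It is only a model: the class name, method names, and message text are hypothetical and are not part of Ignite's API, and the real check would live wherever the node stop routine handles a running checkpoint.

```java
/** Hypothetical model of the proposed stop logic; none of these names are Ignite API. */
final class NodeStopSketch {
    /** Wait for the running checkpoint only when the caller passed cancel = false. */
    static boolean shouldWaitForCheckpoint(boolean cancel, boolean checkpointInProgress) {
        return checkpointInProgress && !cancel;
    }

    /** Hint to log when a running checkpoint is skipped; empty string when nothing needs logging. */
    static String stopHint(boolean cancel, boolean checkpointInProgress) {
        if (checkpointInProgress && cancel)
            return "Node stopped in the middle of a checkpoint. "
                + "Set 'cancel' to 'false' explicitly to wait for the checkpoint to finish.";

        return "";
    }

    public static void main(String[] args) {
        // Default case from the thread: cancel = true while a checkpoint is running.
        System.out.println(shouldWaitForCheckpoint(true, true));
        System.out.println(stopHint(true, true));

        // Explicit cancel = false: the stop call would block until the checkpoint completes.
        System.out.println(shouldWaitForCheckpoint(false, true));
    }
}
```

With this shape the default stop stays fast, and the only visible change for the user is the hint in the log.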

—
Denis

> On Aug 4, 2017, at 4:31 AM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
> 
> My vote is still for making the message softer (crashed -> stopped) and 
> keeping the logic as is.
> 
> Example with File.close() is good, but I think it's not the case here. The 
> state on disk after node stop *will not* reflect all user actions made before 
> Ignite.close() call, independent of whether node was stopped during 
> checkpoint.
> Ignite will recover to actual state anyway, the only difference is WAL replay 
> algorithm (stopping during checkpoint will force Ignite to replay delta 
> records).
> 
> However, waiting for checkpoint on node stop brings two advantages:
> 1) Next start will be faster - fewer WAL records to replay.
> 2) Partition files will be locally consistent after node stop. User will be 
> able to save partition file for any kind of analysis.
> 
> Are they strong enough to force user to wait on stop?
> 
> Best Regards,
> Ivan Rakov
> 
> On 04.08.2017 13:42, Vyacheslav Daradur wrote:
>> Hi guys, I'll just add my opinion if you don't mind.
>> 
>>> Maybe we should implement Vladimir's suggestion to flush the pages
>>> without respect to the cancel flag? Are there any thoughts on this?
>> I think it's a good suggestion.
>> But in the case of unit testing, a developer usually calls #stopAllGrids()
>> at the end of all tests.
>> The method GridAbstractTest#stopAllGrids() is built on top of the method
>> G.stop(name, true).
>> IMO in that case flushing checkpoints isn't necessary.
>> 
>> 
>> 2017-08-04 13:25 GMT+03:00 Dmitry Pavlov <dpavlov....@gmail.com>:
>> 
>>> Thank you all for replies.
>>> 
>>> I like the idea of replacing 'crashed' with 'stopped'. The word 'crashed'
>>> is really confusing.
>>> 
>>> But still, if I call close() on a file, all data is flushed to disk. For
>>> ignite.close(), however, the checkpoint may not be finished.
>>> 
>>> Maybe we should implement Vladimir's suggestion to flush the pages without
>>> respect to the cancel flag? Are there any thoughts on this?
>>> 
>>> Fri, Aug 4, 2017 at 11:12, Vladimir Ozerov <voze...@gridgain.com>:
>>> 
>>>> Ivan,
>>>> 
>>>> Hanging on Ignite.close() will confuse the user no more than a restore on
>>>> start after a graceful shutdown. IMO the correct approach here would be to:
>>>> 1) wait for checkpoint completion irrespective of the "cancel" flag,
>>>> because this flag relates to compute jobs only as per documentation
>>>> 2) print an INFO message to the log that we are saving a checkpoint due to
>>>> node stop.
>>>> 
>>>> On Fri, Aug 4, 2017 at 10:54 AM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>>>>> Dmitriy,
>>>>> 
>>>>> From my point of view, invoking stop(true) is correct behaviour.
>>>>> 
>>>>> Stopping a node in the middle of a checkpoint is an absolutely valid case.
>>>>> That's how persistence works - the node will restore memory state if
>>>>> stopped at any moment.
>>>>> On the other hand, a checkpoint may last for a long time. A thread hanging
>>>>> on Ignite.close() may confuse the user much more than the "crashed in the
>>>>> middle of checkpoint" message.
>>>>> 
>>>>> Best Regards,
>>>>> Ivan Rakov
>>>>> 
>>>>> 
>>>>> On 03.08.2017 22:34, Dmitry Pavlov wrote:
>>>>> 
>>>>>> Hi Igniters,
>>>>>> 
>>>>>> I’ve created the simplest example using Ignite 2.1 and persistence (see
>>>>>> the code below). I've wrapped the Ignite instance in try-with-resources
>>>>>> (I think it is the default approach for AutoCloseable inheritors).
>>>>>> 
>>>>>> But the next time I started this server I got the message: “Ignite node
>>>>>> crashed in the middle of checkpoint. Will restore memory state and
>>>>>> enforce checkpoint on node start.”
>>>>>> 
>>>>>> This happens because in the close() method we don’t wait for the
>>>>>> checkpoint to end. I am afraid this behaviour may confuse users on their
>>>>>> first use of the product.
>>>>>> 
>>>>>> What do you think about changing Ignite.close() from stop(true) to
>>>>>> stop(false)? This would make it wait for checkpoints to finish by
>>>>>> default. Alternatively, we may improve the example to show how to shut
>>>>>> down a server node correctly. The current PersistentStoreExample does not
>>>>>> cover server node shutdown.
>>>>>> 
>>>>>> Any concerns about the close() method change?
>>>>>> 
>>>>>> Sincerely,
>>>>>> Dmitriy Pavlov
>>>>>> 
>>>>>> 
>>>>>> IgniteConfiguration cfg = new IgniteConfiguration();
>>>>>> cfg.setPersistentStoreConfiguration(new PersistentStoreConfiguration());
>>>>>> try (Ignite ignite = Ignition.start(cfg)) {
>>>>>>     ignite.active(true);
>>>>>>     IgniteCache<String, String> cache = ignite.getOrCreateCache("test");
>>>>>>     for (int i = 0; i < 1000; i++)
>>>>>>         cache.put("Key" + i, "Value" + i);
>>>>>> }
>>>>>> 
>>>>>> 
>> 
>> 
> 
