Re: Handling slashes in cache names

2018-01-16 Thread Yakov Zhdanov
>> How about using both escaping and a text file with the name? One can
think of the escaped name as a kind of ID, which happens to be
human-readable when the name is in ASCII,
and as unreadable as a UUID when the name contains non-ASCII characters. This way we have all
the readability in the common case (when the name is all English letters and
digits), and some limited readability (via looking into text files) when
other alphabets are used.

Sounds good to me.

--Yakov


RE: Handling slashes in cache names

2018-01-16 Thread Stanislav Lukyanov
How about using both escaping and a text file with the name?
One can think of the escaped name as a kind of ID, which happens to be 
human-readable when the name is in ASCII,
and as unreadable as a UUID when the name contains non-ASCII characters.
This way we have all the readability in the common case (when the name is all 
English letters and digits),
and some limited readability (via looking into text files) when other alphabets 
are used.

Thanks,
Stan

From: Pavel Tupitsyn
Sent: 16 January 2018, 14:01
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

>  folder named by ID and txt file inside should do the trick
Agree

On Tue, Jan 16, 2018 at 1:02 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> On Mon, Jan 15, 2018 at 7:31 AM, Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
> > > You will never ever relate smth like "fdee0456adcc" to "мои_данные".
> >
> > As a user, why do I need to understand file names in Ignite work
> directory?
> >
>
> Because it is better to have an understandable and human readable directory
> structure than not. Let's do it right.
>



Re: Handling slashes in cache names

2018-01-16 Thread Pavel Tupitsyn
>  folder named by ID and txt file inside should do the trick
Agree

On Tue, Jan 16, 2018 at 1:02 PM, Dmitriy Setrakyan 
wrote:

> On Mon, Jan 15, 2018 at 7:31 AM, Pavel Tupitsyn 
> wrote:
>
> > > You will never ever relate smth like "fdee0456adcc" to "мои_данные".
> >
> > As a user, why do I need to understand file names in Ignite work
> directory?
> >
>
> Because it is better to have an understandable and human readable directory
> structure than not. Let's do it right.
>


Re: Handling slashes in cache names

2018-01-16 Thread Dmitriy Setrakyan
On Mon, Jan 15, 2018 at 7:31 AM, Pavel Tupitsyn 
wrote:

> > You will never ever relate smth like "fdee0456adcc" to "мои_данные".
>
> As a user, why do I need to understand file names in Ignite work directory?
>

Because it is better to have an understandable and human readable directory
structure than not. Let's do it right.


Re: Handling slashes in cache names

2018-01-16 Thread Dmitriy Setrakyan
On Mon, Jan 15, 2018 at 7:11 AM, Pavel Tupitsyn 
wrote:

> > try creating a directory on all nodes
> And then a new node appears with a different kind of file system..
>

If a new node cannot create an existing cache, it should not be allowed to
start.


Re: Handling slashes in cache names

2018-01-15 Thread Yakov Zhdanov
To understand how much storage you need for cache group "X" and watch the
trends.

Anyway, folder named by ID and txt file inside should do the trick =)

--Yakov


Re: Handling slashes in cache names

2018-01-15 Thread Pavel Tupitsyn
> You will never ever relate smth like "fdee0456adcc" to "мои_данные".

As a user, why do I need to understand file names in Ignite work directory?

On Mon, Jan 15, 2018 at 6:22 PM, Yakov Zhdanov  wrote:

> >> And then a new node appears with a different kind of file system..
> This is hardly possible. And I suggest not to
>
> >> Escaping removes all limitations and does not affect usability.
> Disagree. You will never ever relate smth like "fdee0456adcc" to
> "мои_данные".
>
> Guys, I just realized that we create a folder for each cache group. How about we
> use the group ID as the folder name and put a text file, cachegroup.info,
> containing the group name into it?
>
> --Yakov
>


Re: Handling slashes in cache names

2018-01-15 Thread Yakov Zhdanov
>> And then a new node appears with a different kind of file system..
This is hardly possible. And I suggest not to

>> Escaping removes all limitations and does not affect usability.
Disagree. You will never ever relate smth like "fdee0456adcc" to
"мои_данные".

Guys, I just realized that we create a folder for each cache group. How about we
use the group ID as the folder name and put a text file, cachegroup.info,
containing the group name into it?
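
A minimal sketch of that layout, assuming the folder is named by the group ID in hex and the readable name is stored as UTF-8 in cachegroup.info. The class and method names are hypothetical and are not taken from the Ignite code base.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class GroupDirSketch {
    /** File name from the proposal; its content format (plain UTF-8 name) is an assumption. */
    static final String INFO_FILE = "cachegroup.info";

    /** Creates a folder named by the group ID (hex) and stores the readable name inside it. */
    static Path createGroupDir(Path workDir, int groupId, String groupName) {
        try {
            Path dir = Files.createDirectories(workDir.resolve(Integer.toHexString(groupId)));
            Files.write(dir.resolve(INFO_FILE), groupName.getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the human-readable group name back from the folder. */
    static String readGroupName(Path groupDir) {
        try {
            byte[] bytes = Files.readAllBytes(groupDir.resolve(INFO_FILE));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Round trip through a temporary work directory, for demonstration only. */
    static String roundTrip(int groupId, String groupName) {
        try {
            Path workDir = Files.createTempDirectory("work-dir-sketch");
            return readGroupName(createGroupDir(workDir, groupId, groupName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The folder name never depends on the characters of the group name, so a name like "мои_данные" costs nothing on any file system; the price is that a person has to open the text file to find out which group a folder belongs to.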

--Yakov


Re: Handling slashes in cache names

2018-01-15 Thread Pavel Tupitsyn
> try creating a directory on all nodes
And then a new node appears with a different kind of file system..

Escaping removes all limitations and does not affect usability.

Pavel

On Mon, Jan 15, 2018 at 5:47 PM, Yakov Zhdanov  wrote:

> Agree that cache names should be case insensitive - currently it seems that
> we have issues on Windows OS.
>
> As far as allowed characters - why don't we try creating a directory on all
> nodes (but calling toLower() prior to creation)? If creation succeeds
> everywhere then cache name is acceptable. New nodes should throw exception
> if folder creation is impossible.
>
> I don't like escaping since it will not add any usability for, let's say,
> Chinese or Russian names. For example, MySQL supports ASCII:
> [0-9,a-z,A-Z$_] (basic Latin letters, digits 0-9, dollar, underscore) and
> Extended: U+0080 .. U+ [1]
>
> I also would think over some intersection of allowed file name characters
> in different file systems [2]
>
> [1] https://dev.mysql.com/doc/refman/5.7/en/identifiers.html
> [2] https://en.wikipedia.org/wiki/Filename
>
> Yakov Zhdanov
>


Re: Handling slashes in cache names

2018-01-15 Thread Yakov Zhdanov
Agree that cache names should be case insensitive - currently it seems that
we have issues on Windows OS.

As far as allowed characters - why don't we try creating a directory on all
nodes (but calling toLower() prior to creation)? If creation succeeds
everywhere then cache name is acceptable. New nodes should throw exception
if folder creation is impossible.

I don't like escaping since it will not add any usability for, let's say,
Chinese or Russian names. For example, MySQL supports ASCII:
[0-9,a-z,A-Z$_] (basic Latin letters, digits 0-9, dollar, underscore) and
Extended: U+0080 .. U+ [1]

I also would think over some intersection of allowed file name characters
in different file systems [2]

[1] https://dev.mysql.com/doc/refman/5.7/en/identifiers.html
[2] https://en.wikipedia.org/wiki/Filename

Yakov Zhdanov


RE: Handling slashes in cache names

2018-01-15 Thread Stanislav Lukyanov
Let me return to this issue.

> Well, having to support multiple cache name formats going forward will be
> difficult.
I don’t think there is a question of multiple name formats.
Let’s just say that there are issues that can be solved on the base cache level 
(e.g. making cache names always case-insensitive)
and there are issues that have to be solved by the PDS (e.g. special and 
non-ASCII symbols that we don’t want to always ban from names).
I’m not suggesting to introduce anything to PDS that will afterwards be handled 
by the base cache code. We’ll just handle some issues
first, in PDS, and other issues will be handled separately.

> My preference would be to limit to 255 characters right now
That would be good, but it doesn’t really solve the issue with the length.
Since non-ASCII characters (and non-alphanumeric ASCII) are encoded, the actual 
length of a cache’s directory name
may be greater than the length of the cache name (and don’t forget the “cache-” 
prefix).
We could come up with a “really safe” limit, but it might be too small (around 
80?), and that would be limiting the API based on a rather arbitrary
implementation detail.

Another reason why I like to have a hash in the file name is that we might run 
into problems with
two names, one of which is an escaped version of the other, like “my/cache” and 
“my_2f_cache”.
And I guess there can be more similar collisions that we just don’t think of 
right now. Having a hash in the name
just works as a (probabilistic) failsafe for that.
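
To make the collision concrete, here is a small sketch of the underscore-delimited escaping discussed earlier in the thread. It assumes that characters in [a-zA-Z0-9_-] are kept as-is and that everything else becomes "_" + lowercase hex code + "_"; the class name is hypothetical.

```java
final class UnderscoreEscapeSketch {
    /** Keeps [a-zA-Z0-9_-] as-is and wraps the hex code of any other char in underscores. */
    static String escape(String cacheName) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cacheName.length(); i++) {
            char c = cacheName.charAt(i);
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_' || c == '-';
            if (safe)
                sb.append(c);
            else
                sb.append('_').append(Integer.toHexString(c)).append('_');
        }
        return sb.toString();
    }
}
```

With these rules both "my/cache" and "my_2f_cache" escape to "my_2f_cache", because the underscore itself is never escaped. A hash suffix computed from the original name is one way to tell the two apart.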

Thanks,
Stan

From: Dmitriy Setrakyan
Sent: 2 January 2018, 16:40
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

On Fri, Dec 29, 2017 at 2:28 AM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> > I would surround such replacements with "_", e.g.
> "myCacheName_somesymbol_".
> Looks nice, will do.
>
> > Here I am confused. I think the cache names should be case insensitive at
> > all times. I seriously doubt enforcing this rule would cause problems. If
> > we enforce this rule at cache creation time, then we would not have to
> add
> > a hashcode at the end.
> I think I would still keep the hashcode. E.g. I’m now also truncating
> names longer than 255 chars, and the truncated names could be equal. There
> could be more edge cases, and adding an imprint of the identity might help
> to avoid them. The names are readable enough with the hashes, but scary
> enough for users not to mess with them manually – I guess that’s a good
> thing :)

Making cache names always case-insensitive sounds good, but I’d separate it
> to another JIRA issue (it has larger compatibility impact, it affects a
> different part of the code base, etc). Is it OK?
>

Well, having to support multiple cache name formats going forward will be
difficult. I would rather we finalize on it right now.  My preference would
be to limit to 255 characters right now and make cache names case
insensitive. I doubt such change would affect many users, but it would
definitely make things cleaner.

Would be nice to hear what others in the community think. Vladimir O.,
Alexey G.?

D.



Re: Handling slashes in cache names

2018-01-02 Thread Dmitriy Setrakyan
On Fri, Dec 29, 2017 at 2:28 AM, Stanislav Lukyanov 
wrote:

> > I would surround such replacements with "_", e.g.
> "myCacheName_somesymbol_".
> Looks nice, will do.
>
> > Here I am confused. I think the cache names should be case insensitive at
> > all times. I seriously doubt enforcing this rule would cause problems. If
> > we enforce this rule at cache creation time, then we would not have to
> add
> > a hashcode at the end.
> I think I would still keep the hashcode. E.g. I’m now also truncating
> names longer than 255 chars, and the truncated names could be equal. There
> could be more edge cases, and adding an imprint of the identity might help
> to avoid them. The names are readable enough with the hashes, but scary
> enough for users not to mess with them manually – I guess that’s a good
> thing :)

Making cache names always case-insensitive sounds good, but I’d separate it
> to another JIRA issue (it has larger compatibility impact, it affects a
> different part of the code base, etc). Is it OK?
>

Well, having to support multiple cache name formats going forward will be
difficult. I would rather we finalize on it right now.  My preference would
be to limit to 255 characters right now and make cache names case
insensitive. I doubt such change would affect many users, but it would
definitely make things cleaner.

Would be nice to hear what others in the community think. Vladimir O.,
Alexey G.?

D.


RE: Handling slashes in cache names

2017-12-29 Thread Stanislav Lukyanov
> I would surround such replacements with "_", e.g. "myCacheName_somesymbol_".
Looks nice, will do.

> Here I am confused. I think the cache names should be case insensitive at
> all times. I seriously doubt enforcing this rule would cause problems. If
> we enforce this rule at cache creation time, then we would not have to add
> a hashcode at the end.
I think I would still keep the hashcode. E.g. I’m now also truncating names 
longer than 255 chars, and the truncated names could be equal. There could be 
more edge cases, and adding an imprint of the identity might help to avoid 
them. The names are readable enough with the hashes, but scary enough for users 
not to mess with them manually – I guess that’s a good thing :)
Making cache names always case-insensitive sounds good, but I’d separate it to 
another JIRA issue (it has larger compatibility impact, it affects a different 
part of the code base, etc). Is it OK?

Thanks,
Stan

From: Dmitriy Setrakyan
Sent: 28 December 2017, 22:33
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

On Thu, Dec 28, 2017 at 9:22 AM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Hi all,
>
> I’ve implemented an approach of encoding unsafe characters in the cache
> names for persistent storage directories. You can find it at
> https://github.com/gridgain/apache-ignite/tree/ignite-7264.
> How it works now is: 1) all characters outside of the [a-zA-Z0-9_-] class
> are replaced with their hex value (seems to be the easiest way);


I would surround such replacements with "_", e.g. "myCacheName_somesymbol_".


> 2) a hash of the cache name is added at the end of the name to avoid
> case-insensitive collisions.
> There is still a tiny chance of hitting two cache names that are equal
> ignoring case which also have the same hash, but that’s really unlikely.
>

Here I am confused. I think the cache names should be case insensitive at
all times. I seriously doubt enforcing this rule would cause problems. If
we enforce this rule at cache creation time, then we would not have to add
a hashcode at the end.


>
> It seems that there are no complications with this approach.
> The cache name to directory mapping is like
>   mycache -> cache-mycache-f19fd83d
>   my/cool/cache -> cache-my2fcool2fcache
>

As mentioned above, I would prefer "cache-my_2f_cool_2f_cache"


>   my!@#$%^&()cache -> cache-my21402324255e262829cache-84ba3e99
>
> Turns out the persistence is not the only place that doesn’t like special
> symbols in cache names – I also got an exception from MBean registration
> when creating a cache with ‘*’ or ‘?’. Filed
> https://issues.apache.org/jira/browse/IGNITE-7334 for that.
>
> Please let me know if you have any comments.
>
> Thanks,
> Stan
>
> From: Stanislav Lukyanov
> Sent: 25 December 2017, 18:09
> To: dev@ignite.apache.org
> Subject: Handling slashes in cache names
>
> Hi all,
>
> I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I
> need some guidance on what’s the best way to approach it.
>
> The problem is that cache names are not restricted, but if persistence is
> enabled the cache needs to have a corresponding directory on the file
> system (“cache-…”) which can’t be created if the cache name contains
> certain characters (or a reserved system name).
>
> A straightforward approach would be to check if a cache name is allowed on
> the local system (e.g. via `Paths.get(name)`) and fail to create cache if
> it isn’t, but I’m a bit concerned with the consistency of the behavior (the
> same cache name may be allowed on one system and not on another).
> I think a better way would be to replace special characters (say, all
> non-alphanumeric characters) with underscores in file names (not changing
> the cache configuration). Would this be OK? Are there any risks I’m not
> considering?
>
> WDYT?
>
> Thanks,
> Stan
>
>



Re: Handling slashes in cache names

2017-12-28 Thread Dmitriy Setrakyan
On Thu, Dec 28, 2017 at 9:22 AM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Hi all,
>
> I’ve implemented an approach of encoding unsafe characters in the cache
> names for persistent storage directories. You can find it at
> https://github.com/gridgain/apache-ignite/tree/ignite-7264.
> How it works now is: 1) all characters outside of the [a-zA-Z0-9_-] class
> are replaced with their hex value (seems to be the easiest way);


I would surround such replacements with "_", e.g. "myCacheName_somesymbol_".


> 2) a hash of the cache name is added at the end of the name to avoid
> case-insensitive collisions.
> There is still a tiny chance of hitting two cache names that are equal
> ignoring case which also have the same hash, but that’s really unlikely.
>

Here I am confused. I think the cache names should be case insensitive at
all times. I seriously doubt enforcing this rule would cause problems. If
we enforce this rule at cache creation time, then we would not have to add
a hashcode at the end.


>
> It seems that there are no complications with this approach.
> The cache name to directory mapping is like
>   mycache -> cache-mycache-f19fd83d
>   my/cool/cache -> cache-my2fcool2fcache
>

As mentioned above, I would prefer "cache-my_2f_cool_2f_cache"


>   my!@#$%^&()cache -> cache-my21402324255e262829cache-84ba3e99
>
> Turns out the persistence is not the only place that doesn’t like special
> symbols in cache names – I also got an exception from MBean registration
> when creating a cache with ‘*’ or ‘?’. Filed
> https://issues.apache.org/jira/browse/IGNITE-7334 for that.
>
> Please let me know if you have any comments.
>
> Thanks,
> Stan
>
> From: Stanislav Lukyanov
> Sent: 25 December 2017, 18:09
> To: dev@ignite.apache.org
> Subject: Handling slashes in cache names
>
> Hi all,
>
> I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I
> need some guidance on what’s the best way to approach it.
>
> The problem is that cache names are not restricted, but if persistence is
> enabled the cache needs to have a corresponding directory on the file
> system (“cache-…”) which can’t be created if the cache name contains
> certain characters (or a reserved system name).
>
> A straightforward approach would be to check if a cache name is allowed on
> the local system (e.g. via `Paths.get(name)`) and fail to create cache if
> it isn’t, but I’m a bit concerned with the consistency of the behavior (the
> same cache name may be allowed on one system and not on another).
> I think a better way would be to replace special characters (say, all
> non-alphanumeric characters) with underscores in file names (not changing
> the cache configuration). Would this be OK? Are there any risks I’m not
> considering?
>
> WDYT?
>
> Thanks,
> Stan
>
>


RE: Handling slashes in cache names

2017-12-28 Thread Stanislav Lukyanov
Hi all,

I’ve implemented an approach of encoding unsafe characters in the cache names 
for persistent storage directories. You can find it at 
https://github.com/gridgain/apache-ignite/tree/ignite-7264.
How it works now is: 1) all characters outside of the [a-zA-Z0-9_-] class are 
replaced with their hex value (seems to be the easiest way); 2) a hash of the 
cache name is added at the end of the name to avoid case-insensitive collisions.
There is still a tiny chance of hitting two cache names that are equal ignoring 
case which also have the same hash, but that’s really unlikely.

It seems that there are no complications with this approach.
The cache name to directory mapping is like
  mycache -> cache-mycache-f19fd83d
  my/cool/cache -> cache-my2fcool2fcache
  my!@#$%^&()cache -> cache-my21402324255e262829cache-84ba3e99
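
For readers who do not want to open the branch, here is a rough sketch of the two steps. It is not the code from the branch: it assumes each unsafe char is replaced by the lowercase hex of its UTF-16 code unit, and it uses String.hashCode() as a stand-in for the hash, so the suffixes it produces will not necessarily match the ones shown above. The class name is hypothetical.

```java
final class CacheDirNameSketch {
    /** Replaces every char outside [a-zA-Z0-9_-] with the lowercase hex of its code unit. */
    static String escape(String cacheName) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cacheName.length(); i++) {
            char c = cacheName.charAt(i);
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_' || c == '-';
            if (safe)
                sb.append(c);
            else
                sb.append(Integer.toHexString(c));
        }
        return sb.toString();
    }

    /** Builds "cache-<escaped>-<hash>"; the hash function is an assumption (String.hashCode). */
    static String dirName(String cacheName) {
        // %08x prints an int as eight unsigned lowercase hex digits.
        String hash = String.format("%08x", cacheName.hashCode());
        return "cache-" + escape(cacheName) + "-" + hash;
    }
}
```

Since the hash is computed from the original, case-sensitive name, "mycache" and "MyCache" get directory names that still differ on a case-insensitive file system, short of a hash collision.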

Turns out the persistence is not the only place that doesn’t like special 
symbols in cache names – I also got an exception from MBean registration when 
creating a cache with ‘*’ or ‘?’. Filed 
https://issues.apache.org/jira/browse/IGNITE-7334 for that.

Please let me know if you have any comments.

Thanks,
Stan

From: Stanislav Lukyanov
Sent: 25 December 2017, 18:09
To: dev@ignite.apache.org 
Subject: Handling slashes in cache names

Hi all,

I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I need 
some guidance on what’s the best way to approach it.

The problem is that cache names are not restricted, but if persistence is 
enabled the cache needs to have a corresponding directory on the file system 
(“cache-…”) which can’t be created if the cache name contains certain 
characters (or a reserved system name).

A straightforward approach would be to check if a cache name is allowed on the 
local system (e.g. via `Paths.get(name)`) and fail to create cache if it isn’t, 
but I’m a bit concerned with the consistency of the behavior (the same cache 
name may be allowed on one system and not on another).
I think a better way would be to replace special characters (say, all 
non-alphanumeric characters) with underscores in file names (not changing the 
cache configuration). Would this be OK? Are there any risks I’m not considering?
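
A sketch of what the straightforward local check could look like, assuming the directory is named "cache-" plus the cache name. The single-component test is an addition of this sketch (Paths.get treats a slash as a separator rather than rejecting it), and the class name is hypothetical.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class LocalPathCheckSketch {
    /** Returns true if "cache-<name>" parses as a single path element on the local file system. */
    static boolean isValidLocally(String cacheName) {
        try {
            Path p = Paths.get("cache-" + cacheName);
            // A name containing a separator parses fine but yields a nested path.
            return p.getNameCount() == 1;
        } catch (InvalidPathException e) {
            // The local file system rejects the name (e.g. a NUL character, or ':' on Windows).
            return false;
        }
    }
}
```

The result depends on the platform the node runs on, which is exactly the consistency concern: the same name can pass on one node and fail on another.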

WDYT?

Thanks,
Stan



Re: Handling slashes in cache names

2017-12-27 Thread Dmitriy Setrakyan
On Wed, Dec 27, 2017 at 8:05 AM, Pavel Tupitsyn 
wrote:

> Yep, base64 is just an example.
> We need some kind of urlencode, but tailored for file names, so that
> names remain readable.
>
> To avoid uppercase/lowercase collisions on Windows, we can restrict allowed
> characters to lowercase English letters and numbers, - and _, and escape
> everything
> else in some way.
>

I think that we should allow users to specify any case they like, but
internally we should always convert to upper or lower case, whichever one
we choose.
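
A small sketch of that idea, assuming lower case is the one chosen and names are folded with Locale.ROOT; the registry class is hypothetical and only illustrates the lookup, not how Ignite stores cache descriptors.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class CaseInsensitiveNamesSketch {
    /** Maps the lower-cased key to the name exactly as the user spelled it. */
    private final Map<String, String> namesByKey = new HashMap<>();

    private static String key(String cacheName) {
        // Locale.ROOT keeps the folding independent of the node's default locale.
        return cacheName.toLowerCase(Locale.ROOT);
    }

    /** Registers a name; returns false if a name equal ignoring case already exists. */
    boolean register(String cacheName) {
        return namesByKey.putIfAbsent(key(cacheName), cacheName) == null;
    }

    /** Looks a cache up by any spelling and returns the originally registered spelling, or null. */
    String registeredName(String anyCase) {
        return namesByKey.get(key(anyCase));
    }
}
```

Users keep whatever spelling they typed, while the internal key (and anything derived from it, such as a directory name) exists in only one case.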


Re: Handling slashes in cache names

2017-12-27 Thread Sergey Kozlov
Igniters

Using the cache name for file and directory names on a file system is a bad idea.
In that case we have to keep in mind the many limitations of various file systems.
Why not map the cache name to an identifier that tolerates file system limitations?

On Wed, Dec 27, 2017 at 7:05 PM, Pavel Tupitsyn 
wrote:

> Yep, base64 is just an example.
> We need some kind of urlencode, but tailored for file names, so that
> names remain readable.
>
> To avoid uppercase/lowercase collisions on Windows, we can restrict allowed
> characters
> to lowercase English letters and numbers, - and _, and escape everything
> else in some way.
>
> On Wed, Dec 27, 2017 at 5:36 PM, Dmitriy Setrakyan 
> wrote:
>
> > On Wed, Dec 27, 2017 at 6:25 AM, Vladimir Ozerov 
> > wrote:
> >
> > > Having different policies for persistent and non-persistent caches sounds
> > > like a bad idea to me, because there could be trouble should a user try to
> > > switch to persistent mode. It would require code changes.
> > >
> > > Can we just escape all non-latin symbols (e.g. using base64), while
> > leaving
> > > the rest as is? With this approach in most cases cache name will remain
> > the
> > > same, and only multibyte characters would be affected.
> > >
> >
> > Agree, if we can keep cache names in human readable form. Would be nice
> to
> > see some examples.
> >
>



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com


Re: Handling slashes in cache names

2017-12-27 Thread Pavel Tupitsyn
Yep, base64 is just an example.
We need some kind of urlencode, but tailored for file names, so that
names remain readable.

To avoid uppercase/lowercase collisions on Windows, we can restrict allowed
characters
to lowercase English letters and numbers, - and _, and escape everything
else in some way.

On Wed, Dec 27, 2017 at 5:36 PM, Dmitriy Setrakyan 
wrote:

> On Wed, Dec 27, 2017 at 6:25 AM, Vladimir Ozerov 
> wrote:
>
> > Having different policies for persistent and non-persistent caches sounds
> > like a bad idea to me, because there could be trouble should a user try to
> > switch to persistent mode. It would require code changes.
> >
> > Can we just escape all non-latin symbols (e.g. using base64), while
> leaving
> > the rest as is? With this approach in most cases cache name will remain
> the
> > same, and only multibyte characters would be affected.
> >
>
> Agree, if we can keep cache names in human readable form. Would be nice to
> see some examples.
>


Re: Handling slashes in cache names

2017-12-27 Thread Dmitriy Setrakyan
On Wed, Dec 27, 2017 at 6:25 AM, Vladimir Ozerov 
wrote:

> Having different policies for persistent and non-persistent caches sounds
> like a bad idea to me, because there could be trouble should a user try to
> switch to persistent mode. It would require code changes.
>
> Can we just escape all non-latin symbols (e.g. using base64), while leaving
> the rest as is? With this approach in most cases cache name will remain the
> same, and only multibyte characters would be affected.
>

Agree, if we can keep cache names in human readable form. Would be nice to
see some examples.


Re: Handling slashes in cache names

2017-12-27 Thread Vladimir Ozerov
Having different policies for persistent and non-persistent caches sounds
like a bad idea to me, because there could be trouble should a user try to
switch to persistent mode. It would require code changes.

Can we just escape all non-latin symbols (e.g. using base64), while leaving
the rest as is? With this approach in most cases cache name will remain the
same, and only multibyte characters would be affected.

On Wed, Dec 27, 2017 at 5:15 PM, Dmitriy Setrakyan 
wrote:

> On Wed, Dec 27, 2017 at 3:42 AM, Pavel Tupitsyn 
> wrote:
>
> > Agree with Stan and Vladimir.
> > We should not impose any restrictions on cache names, some users may have
> > issues with that.
> >
> > Using cache names as file names is an internal implementation detail.
> > We can use cache id or some kind of encoding (base64, etc) to avoid file
> > system issues.
> >
> >
> Pavel, I disagree. I want to look at the file system and be able to clearly
> tell which folder belongs to which cache. If you use encryption or some
> other encoding, this would be impossible.
>
> I doubt that introducing cache name validation for *persistent* caches
> would affect any existing users. It sounds like for non-persistent caches
> the validation is not needed, right?
>
> D.
>


Re: Handling slashes in cache names

2017-12-27 Thread Dmitriy Setrakyan
On Wed, Dec 27, 2017 at 3:42 AM, Pavel Tupitsyn 
wrote:

> Agree with Stan and Vladimir.
> We should not impose any restrictions on cache names, some users may have
> issues with that.
>
> Using cache names as file names is an internal implementation detail.
> We can use cache id or some kind of encoding (base64, etc) to avoid file
> system issues.
>
>
Pavel, I disagree. I want to look at the file system and be able to clearly
tell which folder belongs to which cache. If you use encryption or some
other encoding, this would be impossible.

I doubt that introducing cache name validation for *persistent* caches
would affect any existing users. It sounds like for non-persistent caches
the validation is not needed, right?

D.


Re: Handling slashes in cache names

2017-12-27 Thread Igor Sapego
Also, considering case-insensitivity issue, we need to choose
some encoding that only uses upper or lower case letters in
encoding result.

By the way, such encoding will resolve cache name clashes
due to case-insensitivity issue.
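
As an illustration of a single-case encoding (standard Base64 is not one, since its alphabet mixes upper and lower case letters), here is a sketch that hex-encodes the UTF-8 bytes of the name. Hex is only one possible choice and is not something the thread has agreed on; the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class SingleCaseEncodingSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Encodes the UTF-8 bytes of the name as lowercase hex, two digits per byte. */
    static String hexEncode(String cacheName) {
        byte[] bytes = cacheName.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }
}
```

The output contains only digits and the letters a-f, so names that differ only by case still produce different directory names on a case-insensitive file system. The cost is that the result is twice as long as the input in bytes and not human-readable.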

Best Regards,
Igor

On Wed, Dec 27, 2017 at 4:18 PM, Igor Sapego <isap...@apache.org> wrote:

> I personally like Pavel's suggestion - base64 encoding seems like
> a good solution, while string hashes will raise a collision issue.
>
> Best Regards,
> Igor
>
> On Wed, Dec 27, 2017 at 3:29 PM, Petr Ivanov <mr.wei...@gmail.com> wrote:
>
>> Banning special characters is an exclusion-based approach that will be
>> hard to maintain if new problematic symbols arise.
>> Maybe a better solution would be to choose a set of permitted symbols for
>> cache names (e.g. [a-zA-Z0-9_-])?
>>
>>
>> Also +1 for using an abstract hash string for directory names.
>>
>>
>> > On 27 Dec 2017, at 15:14, Stanislav Lukyanov <stanlukya...@gmail.com>
>> wrote:
>> >
>> > We can – by mapping a cache name to some (safe) string to be used as a
>> directory name, say via Base64 as Pavel has suggested.
>> >
>> > However, I think that banning certain characters might be reasonable.
>> > Some characters might be considered reserved (e.g. slashes, colon,
>> asterisk, etc) to be used later, in case some future feature requires cache
>> names to have an actual meaning.
>> > Some characters might be banned just as a precaution (e.g. control
>> characters or whitespaces) because they might cause problems with logging
>> or elsewhere (you might have a bad time processing a cache name with \0 in
>> it :) ).
>> >
>> > The question is whether or not these considerations are worth adding code
>> > and/or changing existing behavior.
>> >
>> > BTW Java folks had similar discussion on Java module names resulting in
>> http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.
>> >
>> > Thanks,
>> > Stan
>> >
>> > From: Vladimir Ozerov
>> > Sent: 27 December 2017, 14:37
>> > To: dev@ignite.apache.org
>> > Subject: Re: Handling slashes in cache names
>> >
>> > A cache name appears to me to be a purely logical entity. Can we simply store the
>> > cache ID in file system paths without adding any restrictions to cache names?
>> >
>> > On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <
>> stanlukya...@gmail.com>
>> > wrote:
>> >
>> >> Well, that’s my question too :)
>> >> Do we have any compatibility guidelines or other documents on what can
>> or
>> >> cannot be in a minor/major release?
>> >>
>> >> Also, it might be helpful to add an environment variable (like
>> >> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior,
>> just
>> >> in case.
>> >>
>> >> Thanks,
>> >> Stan
>> >>
>> >> From: Dmitriy Setrakyan
>> >> Sent: 26 December 2017, 17:02
>> >> To: dev@ignite.apache.org
>> >> Subject: Re: Handling slashes in cache names
>> >>
>> >> Looks good to me. Is this going to be an exception on startup? If yes,
>> is
>> >> it safe to release it, or should we wait till 3.0?
>> >>
>> >> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
>> >> stanlukya...@gmail.com>
>> >> wrote:
>> >>
>> >>> Thanks for the feedback.
>> >>>
>> >>> It seems that another thing to handle is case-insensitive FS –
>> “mycache”
>> >>> and “MyCache” are the same on Windows, so it might be reasonable to
>> >> disallow
>> >>> having two caches with names that are equal ignoring case.
>> >>> And one more thing is control characters – forbidding at least range
>> of
>> >>> ASCII 0x00-0x20 seems reasonable.
>> >>>
>> >>> To summarize, a possible set of restrictions would be
>> >>> - Whitespace characters (via Character.isWhitespace)
>> >>> - Control characters (via Character.isISOControl)
>> >>> - Slashes
>> >>> - Characters reserved in Windows (<>:"/\|?*)
>> >>> - Length (say, up to 255)
>> >>> - Distinct names of caches when ignoring case
>> >>> It seems reasonable to enforce that even regardless of persistence
>> >>> directories naming (AFAIU tha

Re: Handling slashes in cache names

2017-12-27 Thread Igor Sapego
I personally like Pavel's suggestion - base64 encoding seems like
a good solution, while string hashes will raise a collision issue.

Best Regards,
Igor

On Wed, Dec 27, 2017 at 3:29 PM, Petr Ivanov <mr.wei...@gmail.com> wrote:

> Banning special characters is an exclusion-based approach that will be
> hard to maintain if new problematic symbols arise.
> Maybe a better solution would be to choose a set of permitted symbols for
> cache names (e.g. [a-zA-Z0-9_-])?
>
>
> Also +1 for using an abstract hash string for directory names.
>
>
> > On 27 Dec 2017, at 15:14, Stanislav Lukyanov <stanlukya...@gmail.com>
> wrote:
> >
> > We can – by mapping a cache name to some (safe) string to be used as a
> directory name, say via Base64 as Pavel has suggested.
> >
> > However, I think that banning certain characters might be reasonable.
> > Some characters might be considered reserved (e.g. slashes, colon,
> asterisk, etc) to be used later, in case some future feature requires cache
> names to have an actual meaning.
> > Some characters might be banned just as a precaution (e.g. control
> characters or whitespaces) because they might cause problems with logging
> or elsewhere (you might have a bad time processing a cache name with \0 in
> it :) ).
> >
> > The question is whether or not these considerations are worth adding code
> > and/or changing existing behavior.
> >
> > BTW Java folks had similar discussion on Java module names resulting in
> http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.
> >
> > Thanks,
> > Stan
> >
> > From: Vladimir Ozerov
> > Sent: 27 December 2017, 14:37
> > To: dev@ignite.apache.org
> > Subject: Re: Handling slashes in cache names
> >
> > Cache name appears to me purely logical entity. Can we simply store cache
> > ID in file system paths without adding any restrictions to cache names?
> >
> > On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> > wrote:
> >
> >> Well, that’s my question too :)
> >> Do we have any compatibility guidelines or other documents on what can
> or
> >> cannot be in a minor/major release?
> >>
> >> Also, it might be helpful to add an environment variable (like
> >> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior,
> just
> >> in case.
> >>
> >> Thanks,
> >> Stan
> >>
> >> From: Dmitriy Setrakyan
> >> Sent: 26 декабря 2017 г. 17:02
> >> To: dev@ignite.apache.org
> >> Subject: Re: Handling slashes in cache names
> >>
> >> Looks good to me. Is this going to be an exception on startup? If yes,
> is
> >> it safe to release it, or should we wait till 3.0?
> >>
> >> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> >> stanlukya...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the feedback.
> >>>
> >>> It seems that another thing to handle is case-insensitive FS –
> “mycache”
> >>> and “MyCache” is the same on Windows, so it might be reasonable to
> >> disallow
> >>> having two caches with names that are equal ignoring case.
> >>> And one more thing is control characters – forbidding at least range of
> >>> ASCII 0x00-0x20 seems reasonable.
> >>>
> >>> To summarize, a possible set of restrictions would be
> >>> - Whitespace characters (via Character.isWhitespace)
> >>> - Control characters (via Character.isISOControl)
> >>> - Slashes
> >>> - Characters reserved in Windows (<>:"/\|?*)
> >>> - Length (say, up to 255)
> >>> - Distinct names of caches when ignoring case
> >>> It seems reasonable to enforce that even regardless of persistence
> >>> directories naming (AFAIU that’s what Dmitry meant by forbidding things
> >>> altogether), so that’s what I’m going to do.
> >>> Any concerns?
> >>> Specifically, would it be OK from backward compatibility point of view
> to
> >>> forbid all these characters now for all caches?
> >>>
> >>> Thanks,
> >>> Stan
> >>>
> >>>
> >>> From: Alexey Kuznetsov
> >>> Sent: 26 декабря 2017 г. 7:51
> >>> To: dev@ignite.apache.org
> >>> Subject: Re: Handling slashes in cache names
> >>>
> >>> It also make sense to limit cache name length to reasonable length.
> >>> Because some File systems coul

Re: Handling slashes in cache names

2017-12-27 Thread Petr Ivanov
Banning special characters is an exclusion-based approach and cannot be kept 
under control if new problematic symbols arise in the future.
Maybe a better solution would be to choose a set of permitted symbols for 
cache names (e.g. [a-zA-Z0-9_-])?
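
A minimal sketch of such an allow-list check, using the character class
proposed above. The class and method names are hypothetical, and rejecting
empty names is an extra assumption.

```java
import java.util.regex.Pattern;

final class CacheNameWhitelist {
    /** One or more characters from the permitted set; '-' is last, so it is a literal. */
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9_-]+");

    /** Returns true only if every character of the name is in the permitted set. */
    static boolean isAllowed(String name) {
        return name != null && ALLOWED.matcher(name).matches();
    }
}
```

Note that this check also rejects names written in non-Latin alphabets, which
is the trade-off of an allow-list compared to banning specific characters.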


Also +1 for using an abstract hash string for directory names.


> On 27 Dec 2017, at 15:14, Stanislav Lukyanov <stanlukya...@gmail.com> wrote:
> 
> We can – by mapping a cache name to some (safe) string to be used as a 
> directory name, say via Base64 as Pavel has suggested.
> 
> However, I think that banning certain characters might be reasonable.
> Some characters might be considered reserved (e.g. slashes, colon, asterisk, 
> etc) to be used later, in case some future feature requires cache names to 
> have an actual meaning.
> Some characters might be banned just as a precaution (e.g. control characters 
> or whitespaces) because they might cause problems with logging or elsewhere 
> (you might have a bad time processing a cache name with \0 in it :) ).
> 
> The question is whether or not these considerations worth adding code and/or 
> changing existing behavior.
> 
> BTW Java folks had similar discussion on Java module names resulting in 
> http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.
> 
> Thanks,
> Stan
> 
> From: Vladimir Ozerov
> Sent: 27 декабря 2017 г. 14:37
> To: dev@ignite.apache.org
> Subject: Re: Handling slashes in cache names
> 
> Cache name appears to me purely logical entity. Can we simply store cache
> ID in file system paths without adding any restrictions to cache names?
> 
> On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <stanlukya...@gmail.com>
> wrote:
> 
>> Well, that’s my question too :)
>> Do we have any compatibility guidelines or other documents on what can or
>> cannot be in a minor/major release?
>> 
>> Also, it might be helpful to add an environment variable (like
>> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior, just
>> in case.
>> 
>> Thanks,
>> Stan
>> 
>> From: Dmitriy Setrakyan
>> Sent: 26 декабря 2017 г. 17:02
>> To: dev@ignite.apache.org
>> Subject: Re: Handling slashes in cache names
>> 
>> Looks good to me. Is this going to be an exception on startup? If yes, is
>> it safe to release it, or should we wait till 3.0?
>> 
>> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
>> stanlukya...@gmail.com>
>> wrote:
>> 
>>> Thanks for the feedback.
>>> 
>>> It seems that another thing to handle is case-insensitive FS – “mycache”
>>> and “MyCache” is the same on Windows, so it might be reasonable to
>> disallow
>>> having two caches with names that are equal ignoring case.
>>> And one more thing is control characters – forbidding at least range of
>>> ASCII 0x00-0x20 seems reasonable.
>>> 
>>> To summarize, a possible set of restrictions would be
>>> - Whitespace characters (via Character.isWhitespace)
>>> - Control characters (via Character.isISOControl)
>>> - Slashes
>>> - Characters reserved in Windows (<>:"/\|?*)
>>> - Length (say, up to 255)
>>> - Distinct names of caches when ignoring case
>>> It seems reasonable to enforce that even regardless of persistence
>>> directories naming (AFAIU that’s what Dmitry meant by forbidding things
>>> altogether), so that’s what I’m going to do.
>>> Any concerns?
>>> Specifically, would it be OK from backward compatibility point of view to
>>> forbid all these characters now for all caches?
>>> 
>>> Thanks,
>>> Stan
>>> 
>>> 
>>> From: Alexey Kuznetsov
>>> Sent: 26 декабря 2017 г. 7:51
>>> To: dev@ignite.apache.org
>>> Subject: Re: Handling slashes in cache names
>>> 
>>> It also make sense to limit cache name length to reasonable length.
>>> Because some File systems could have limitations on path length.
>>> See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
>>> 
>>> On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
>> dsetrak...@apache.org>
>>> wrote:
>>> 
>>>> My preference would be to prohibit forward and backward slashes in
>> cache
>>>> names altogether, as they may create a false feeling of some directory
>>>> structure, which does not exist. We should also prohibit spaces as
>> well.
>>>> 
>>>> D.
>>>> 
>>>> On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
>>>> 

RE: Handling slashes in cache names

2017-12-27 Thread Stanislav Lukyanov
We can – by mapping a cache name to some (safe) string to be used as a 
directory name, say via Base64 as Pavel has suggested.
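
A rough sketch of such a mapping, assuming the URL-safe Base64 alphabet
(standard Base64 output may itself contain '/'). The class and method names
are hypothetical, and the "cache-" prefix is assumed from the directory naming
mentioned earlier in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64DirNames {
    private static final String PREFIX = "cache-";

    /** Maps a cache name to a directory name made of [A-Za-z0-9_-] only. */
    static String toDirName(String cacheName) {
        byte[] utf8 = cacheName.getBytes(StandardCharsets.UTF_8);
        // URL-safe alphabet uses '-' and '_' instead of '+' and '/'.
        return PREFIX + Base64.getUrlEncoder().withoutPadding().encodeToString(utf8);
    }

    /** Recovers the cache name from a directory name produced by toDirName. */
    static String toCacheName(String dirName) {
        String encoded = dirName.substring(PREFIX.length());
        return new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

Two caveats apply: the encoded name is about a third longer than the UTF-8
form of the original, and Base64 is case-sensitive, so on a case-insensitive
file system two different encoded names could still clash. A hex encoding
would avoid the latter at the cost of even longer names.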

However, I think that banning certain characters might be reasonable.
Some characters might be considered reserved (e.g. slashes, colon, asterisk, 
etc) to be used later, in case some future feature requires cache names to have 
an actual meaning.
Some characters might be banned just as a precaution (e.g. control characters 
or whitespace) because they might cause problems with logging or elsewhere 
(you might have a bad time processing a cache name with \0 in it :) ).

The question is whether or not these considerations are worth adding code 
and/or changing existing behavior.

BTW Java folks had a similar discussion on Java module names resulting in 
http://mail.openjdk.java.net/pipermail/jpms-spec-experts/2016-December/000515.html.

Thanks,
Stan

From: Vladimir Ozerov
Sent: 27 декабря 2017 г. 14:37
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

Cache name appears to me purely logical entity. Can we simply store cache
ID in file system paths without adding any restrictions to cache names?

On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Well, that’s my question too :)
> Do we have any compatibility guidelines or other documents on what can or
> cannot be in a minor/major release?
>
> Also, it might be helpful to add an environment variable (like
> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior, just
> in case.
>
> Thanks,
> Stan
>
> From: Dmitriy Setrakyan
> Sent: 26 декабря 2017 г. 17:02
> To: dev@ignite.apache.org
> Subject: Re: Handling slashes in cache names
>
> Looks good to me. Is this going to be an exception on startup? If yes, is
> it safe to release it, or should we wait till 3.0?
>
> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Thanks for the feedback.
> >
> > It seems that another thing to handle is case-insensitive FS – “mycache”
> > and “MyCache” is the same on Windows, so it might be reasonable to
> disallow
> > having two caches with names that are equal ignoring case.
> > And one more thing is control characters – forbidding at least range of
> > ASCII 0x00-0x20 seems reasonable.
> >
> > To summarize, a possible set of restrictions would be
> > - Whitespace characters (via Character.isWhitespace)
> > - Control characters (via Character.isISOControl)
> > - Slashes
> > - Characters reserved in Windows (<>:"/\|?*)
> > - Length (say, up to 255)
> > - Distinct names of caches when ignoring case
> > It seems reasonable to enforce that even regardless of persistence
> > directories naming (AFAIU that’s what Dmitry meant by forbidding things
> > altogether), so that’s what I’m going to do.
> > Any concerns?
> > Specifically, would it be OK from backward compatibility point of view to
> > forbid all these characters now for all caches?
> >
> > Thanks,
> > Stan
> >
> >
> > From: Alexey Kuznetsov
> > Sent: 26 декабря 2017 г. 7:51
> > To: dev@ignite.apache.org
> > Subject: Re: Handling slashes in cache names
> >
> > It also make sense to limit cache name length to reasonable length.
> > Because some File systems could have limitations on path length.
> > See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
> >
> > On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> > > My preference would be to prohibit forward and backward slashes in
> cache
> > > names altogether, as they may create a false feeling of some directory
> > > structure, which does not exist. We should also prohibit spaces as
> well.
> > >
> > > D.
> > >
> > > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > > stanlukya...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264,
> > and
> > > I
> > > > need some guidance on what’s the best way to approach it.
> > > >
> > > > The problem is that cache names are not restricted, but if
> persistence
> > is
> > > > enabled the cache needs to have a corresponding directory on the file
> > > > system (“cache-…”) which can’t be created if the cache name contains
> > > > certain characters (or a reserved system name).
> > > >
> > > > A straightforward approach would be to check if a cache name is
>

Re: Handling slashes in cache names

2017-12-27 Thread Pavel Tupitsyn
Agree with Stan and Vladimir.
We should not impose any restrictions on cache names; some users may have
issues with that.

Using cache names as file names is an internal implementation detail.
We can use the cache ID or some kind of encoding (Base64, etc.) to avoid file
system issues.

Thanks,
Pavel

On Wed, Dec 27, 2017 at 2:38 PM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> That’s interesting, thanks.
> So, do you think the locale-specific file separators should be banned as
> well?
> Handling all kinds of cases like this might be complicated.
>
> I’d rather use something else if the cache name is not a valid file name,
> a hash of the cache name.
> This way all corner cases can be handled at once.
> The algorithm would be
> 1) Check that cache name doesn’t contain banned characters
> 2) Try to create a Path for “cache-”
> 3) If failed, create a Path for “cache-”
>
> Stan
>
> From: Igor Sapego
> Sent: 26 декабря 2017 г. 17:59
> To: dev@ignite.apache.org
> Subject: Re: Handling slashes in cache names
>
> There are also some international features that you might want to
> address. For example, instead of backslash some other characters
> may be used on Windows - ¥ on the Japanese version, ₩ on the
> Korean version.
> See [1] for more info.
>
> Here is the citation:
> Security Considerations for Character Sets in File Names
> Windows code page and OEM character sets used on
> Japanese-language systems contain the Yen symbol (¥) instead of
> a backslash (\). Thus, the Yen character is a prohibited character for
> NTFS and FAT file systems. When mapping Unicode to
> a Japanese-language code page, conversion functions map both
> backslash (U+005C) and the normal Unicode Yen symbol (U+00A5)
> to this same character. For security reasons, your applications should
> not typically allow the character U+00A5 in a Unicode string that
> might be converted for use as a FAT file name.
>
> [1] - https://msdn.microsoft.com/en-us/library/dd374047(v=vs.85).aspx
>
>
> Best Regards,
> Igor
>
> On Tue, Dec 26, 2017 at 5:01 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
>
> > Looks good to me. Is this going to be an exception on startup? If yes, is
> > it safe to release it, or should we wait till 3.0?
> >
> > On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> > stanlukya...@gmail.com>
> > wrote:
> >
> > > Thanks for the feedback.
> > >
> > > It seems that another thing to handle is case-insensitive FS –
> “mycache”
> > > and “MyCache” is the same on Windows, so it might be reasonable to
> > disallow
> > > having two caches with names that are equal ignoring case.
> > > And one more thing is control characters – forbidding at least range of
> > > ASCII 0x00-0x20 seems reasonable.
> > >
> > > To summarize, a possible set of restrictions would be
> > > - Whitespace characters (via Character.isWhitespace)
> > > - Control characters (via Character.isISOControl)
> > > - Slashes
> > > - Characters reserved in Windows (<>:"/\|?*)
> > > - Length (say, up to 255)
> > > - Distinct names of caches when ignoring case
> > > It seems reasonable to enforce that even regardless of persistence
> > > directories naming (AFAIU that’s what Dmitry meant by forbidding things
> > > altogether), so that’s what I’m going to do.
> > > Any concerns?
> > > Specifically, would it be OK from backward compatibility point of view
> to
> > > forbid all these characters now for all caches?
> > >
> > > Thanks,
> > > Stan
> > >
> > >
> > > From: Alexey Kuznetsov
> > > Sent: 26 декабря 2017 г. 7:51
> > > To: dev@ignite.apache.org
> > > Subject: Re: Handling slashes in cache names
> > >
> > > It also make sense to limit cache name length to reasonable length.
> > > Because some File systems could have limitations on path length.
> > > See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
> > >
> > > On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
> > dsetrak...@apache.org>
> > > wrote:
> > >
> > > > My preference would be to prohibit forward and backward slashes in
> > cache
> > > > names altogether, as they may create a false feeling of some
> directory
> > > > structure, which does not exist. We should also prohibit spaces as
> > well.
> > > >
> > > > D.
> > > >
> > > > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > > > stanlukya.

RE: Handling slashes in cache names

2017-12-27 Thread Stanislav Lukyanov
That’s interesting, thanks.
So, do you think the locale-specific file separators should be banned as well?
Handling all kinds of cases like this might be complicated.

I’d rather use something else if the cache name is not a valid file name, 
say a hash of the cache name.
This way all corner cases can be handled at once.
The algorithm would be
1) Check that the cache name doesn’t contain banned characters
2) Try to create a Path for “cache-”
3) If failed, create a Path for “cache-”
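
Steps 2 and 3 could be sketched as follows. The suffixes after “cache-” did
not survive in the text above, so this assumes step 2 uses the cache name
itself and step 3 uses a hash of it. The class name and the use of
String.hashCode() as the hash are hypothetical choices for illustration.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CacheDirResolver {
    /** Returns workDir/cache-NAME if that is a valid single directory name, else a hashed fallback. */
    static Path resolve(Path workDir, String cacheName) {
        String plain = "cache-" + cacheName;
        try {
            Path rel = Paths.get(plain);
            // Accept only if the name stays one path element and is not altered by the file system.
            if (rel.getNameCount() == 1 && rel.getFileName().toString().equals(plain))
                return workDir.resolve(rel);
        } catch (InvalidPathException ignored) {
            // The local file system rejects this name; fall through to the hash.
        }
        // Assumed fallback: hex of String.hashCode(); a real implementation would want a wider hash.
        return workDir.resolve("cache-" + Integer.toHexString(cacheName.hashCode()));
    }
}
```

Because Paths.get validates against the local file system only, the same cache
name may take the plain branch on one OS and the fallback on another, so the
resulting directory names are not portable between nodes.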

Stan

From: Igor Sapego
Sent: 26 декабря 2017 г. 17:59
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

There are also some international features that you might want to
address. For example, instead of backslash some other characters
may be used on Windows - ¥ on the Japanese version, ₩ on the
Korean version.
See [1] for more info.

Here is the citation:
Security Considerations for Character Sets in File Names
Windows code page and OEM character sets used on
Japanese-language systems contain the Yen symbol (¥) instead of
a backslash (\). Thus, the Yen character is a prohibited character for
NTFS and FAT file systems. When mapping Unicode to
a Japanese-language code page, conversion functions map both
backslash (U+005C) and the normal Unicode Yen symbol (U+00A5)
to this same character. For security reasons, your applications should
not typically allow the character U+00A5 in a Unicode string that
might be converted for use as a FAT file name.

[1] - https://msdn.microsoft.com/en-us/library/dd374047(v=vs.85).aspx


Best Regards,
Igor

On Tue, Dec 26, 2017 at 5:01 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> Looks good to me. Is this going to be an exception on startup? If yes, is
> it safe to release it, or should we wait till 3.0?
>
> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Thanks for the feedback.
> >
> > It seems that another thing to handle is case-insensitive FS – “mycache”
> > and “MyCache” is the same on Windows, so it might be reasonable to
> disallow
> > having two caches with names that are equal ignoring case.
> > And one more thing is control characters – forbidding at least range of
> > ASCII 0x00-0x20 seems reasonable.
> >
> > To summarize, a possible set of restrictions would be
> > - Whitespace characters (via Character.isWhitespace)
> > - Control characters (via Character.isISOControl)
> > - Slashes
> > - Characters reserved in Windows (<>:"/\|?*)
> > - Length (say, up to 255)
> > - Distinct names of caches when ignoring case
> > It seems reasonable to enforce that even regardless of persistence
> > directories naming (AFAIU that’s what Dmitry meant by forbidding things
> > altogether), so that’s what I’m going to do.
> > Any concerns?
> > Specifically, would it be OK from backward compatibility point of view to
> > forbid all these characters now for all caches?
> >
> > Thanks,
> > Stan
> >
> >
> > From: Alexey Kuznetsov
> > Sent: 26 декабря 2017 г. 7:51
> > To: dev@ignite.apache.org
> > Subject: Re: Handling slashes in cache names
> >
> > It also make sense to limit cache name length to reasonable length.
> > Because some File systems could have limitations on path length.
> > See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
> >
> > On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> > > My preference would be to prohibit forward and backward slashes in
> cache
> > > names altogether, as they may create a false feeling of some directory
> > > structure, which does not exist. We should also prohibit spaces as
> well.
> > >
> > > D.
> > >
> > > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > > stanlukya...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264,
> > and
> > > I
> > > > need some guidance on what’s the best way to approach it.
> > > >
> > > > The problem is that cache names are not restricted, but if
> persistence
> > is
> > > > enabled the cache needs to have a corresponding directory on the file
> > > > system (“cache-…”) which can’t be created if the cache name contains
> > > > certain characters (or a reserved system name).
> > > >
> > > > A straightforward approach would be to check if a cache name is
> allowed
> > > on
> > > > the local system (e.g. via `Paths.get(name)`) and fail to create
> cache
> > if
> > > > it isn’t, but I’m a bit concerned with the consistency of the
> behavior
> > > (the
> > > > same cache name be allowed on one system and not on another).
> > > > I think a better way would be to replace special characters (say, all
> > > > non-alphanumeric characters) with underscores in file names (not
> > changing
> > > > the cache configuration). Would this be OK? Are there any risks I’m
> not
> > > > considering?
> > > >
> > > > WDYT?
> > > >
> > > > Thanks,
> > > > Stan
> > > >
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
> >
>



Re: Handling slashes in cache names

2017-12-27 Thread Vladimir Ozerov
The cache name appears to me to be a purely logical entity. Can we simply store
the cache ID in file system paths without adding any restrictions to cache names?

On Wed, Dec 27, 2017 at 2:26 PM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Well, that’s my question too :)
> Do we have any compatibility guidelines or other documents on what can or
> cannot be in a minor/major release?
>
> Also, it might be helpful to add an environment variable (like
> IGNITE_DISABLE_CACHE_NAME_RESTRICTIONS) to restore the old behavior, just
> in case.
>
> Thanks,
> Stan
>
> From: Dmitriy Setrakyan
> Sent: 26 декабря 2017 г. 17:02
> To: dev@ignite.apache.org
> Subject: Re: Handling slashes in cache names
>
> Looks good to me. Is this going to be an exception on startup? If yes, is
> it safe to release it, or should we wait till 3.0?
>
> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Thanks for the feedback.
> >
> > It seems that another thing to handle is case-insensitive FS – “mycache”
> > and “MyCache” is the same on Windows, so it might be reasonable to
> disallow
> > having two caches with names that are equal ignoring case.
> > And one more thing is control characters – forbidding at least range of
> > ASCII 0x00-0x20 seems reasonable.
> >
> > To summarize, a possible set of restrictions would be
> > - Whitespace characters (via Character.isWhitespace)
> > - Control characters (via Character.isISOControl)
> > - Slashes
> > - Characters reserved in Windows (<>:"/\|?*)
> > - Length (say, up to 255)
> > - Distinct names of caches when ignoring case
> > It seems reasonable to enforce that even regardless of persistence
> > directories naming (AFAIU that’s what Dmitry meant by forbidding things
> > altogether), so that’s what I’m going to do.
> > Any concerns?
> > Specifically, would it be OK from backward compatibility point of view to
> > forbid all these characters now for all caches?
> >
> > Thanks,
> > Stan
> >
> >
> > From: Alexey Kuznetsov
> > Sent: 26 декабря 2017 г. 7:51
> > To: dev@ignite.apache.org
> > Subject: Re: Handling slashes in cache names
> >
> > It also make sense to limit cache name length to reasonable length.
> > Because some File systems could have limitations on path length.
> > See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
> >
> > On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> > > My preference would be to prohibit forward and backward slashes in
> cache
> > > names altogether, as they may create a false feeling of some directory
> > > structure, which does not exist. We should also prohibit spaces as
> well.
> > >
> > > D.
> > >
> > > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > > stanlukya...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264,
> > and
> > > I
> > > > need some guidance on what’s the best way to approach it.
> > > >
> > > > The problem is that cache names are not restricted, but if
> persistence
> > is
> > > > enabled the cache needs to have a corresponding directory on the file
> > > > system (“cache-…”) which can’t be created if the cache name contains
> > > > certain characters (or a reserved system name).
> > > >
> > > > A straightforward approach would be to check if a cache name is
> allowed
> > > on
> > > > the local system (e.g. via `Paths.get(name)`) and fail to create
> cache
> > if
> > > > it isn’t, but I’m a bit concerned with the consistency of the
> behavior
> > > (the
> > > > same cache name be allowed on one system and not on another).
> > > > I think a better way would be to replace special characters (say, all
> > > > non-alphanumeric characters) with underscores in file names (not
> > changing
> > > > the cache configuration). Would this be OK? Are there any risks I’m
> not
> > > > considering?
> > > >
> > > > WDYT?
> > > >
> > > > Thanks,
> > > > Stan
> > > >
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
> >
>
>


Re: Handling slashes in cache names

2017-12-26 Thread Igor Sapego
There are also some international features that you might want to
address. For example, instead of backslash some other characters
may be used on Windows - ¥ on the Japanese version, ₩ on the
Korean version.
See [1] for more info.

Here is the citation:
Security Considerations for Character Sets in File Names
Windows code page and OEM character sets used on
Japanese-language systems contain the Yen symbol (¥) instead of
a backslash (\). Thus, the Yen character is a prohibited character for
NTFS and FAT file systems. When mapping Unicode to
a Japanese-language code page, conversion functions map both
backslash (U+005C) and the normal Unicode Yen symbol (U+00A5)
to this same character. For security reasons, your applications should
not typically allow the character U+00A5 in a Unicode string that
might be converted for use as a FAT file name.

[1] - https://msdn.microsoft.com/en-us/library/dd374047(v=vs.85).aspx
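
If such characters were to be banned, the check might look like the sketch
below. The class and method names are hypothetical, and treating the Won sign
(U+20A9) the same way as the Yen sign is an assumption based on the Korean
case mentioned above rather than on the cited text.

```java
final class LocaleSeparators {
    /** Returns true if the name contains a backslash or a character that may be mapped to one. */
    static boolean containsSeparatorLookalike(String name) {
        return name.indexOf('\\') >= 0       // backslash, U+005C
            || name.indexOf('\u00A5') >= 0   // Yen sign
            || name.indexOf('\u20A9') >= 0;  // Won sign (assumed to behave like Yen)
    }
}
```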


Best Regards,
Igor

On Tue, Dec 26, 2017 at 5:01 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> Looks good to me. Is this going to be an exception on startup? If yes, is
> it safe to release it, or should we wait till 3.0?
>
> On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Thanks for the feedback.
> >
> > It seems that another thing to handle is case-insensitive FS – “mycache”
> > and “MyCache” is the same on Windows, so it might be reasonable to
> disallow
> > having two caches with names that are equal ignoring case.
> > And one more thing is control characters – forbidding at least range of
> > ASCII 0x00-0x20 seems reasonable.
> >
> > To summarize, a possible set of restrictions would be
> > - Whitespace characters (via Character.isWhitespace)
> > - Control characters (via Character.isISOControl)
> > - Slashes
> > - Characters reserved in Windows (<>:"/\|?*)
> > - Length (say, up to 255)
> > - Distinct names of caches when ignoring case
> > It seems reasonable to enforce that even regardless of persistence
> > directories naming (AFAIU that’s what Dmitry meant by forbidding things
> > altogether), so that’s what I’m going to do.
> > Any concerns?
> > Specifically, would it be OK from backward compatibility point of view to
> > forbid all these characters now for all caches?
> >
> > Thanks,
> > Stan
> >
> >
> > From: Alexey Kuznetsov
> > Sent: 26 декабря 2017 г. 7:51
> > To: dev@ignite.apache.org
> > Subject: Re: Handling slashes in cache names
> >
> > It also make sense to limit cache name length to reasonable length.
> > Because some File systems could have limitations on path length.
> > See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
> >
> > On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> > > My preference would be to prohibit forward and backward slashes in
> cache
> > > names altogether, as they may create a false feeling of some directory
> > > structure, which does not exist. We should also prohibit spaces as
> well.
> > >
> > > D.
> > >
> > > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > > stanlukya...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264,
> > and
> > > I
> > > > need some guidance on what’s the best way to approach it.
> > > >
> > > > The problem is that cache names are not restricted, but if
> persistence
> > is
> > > > enabled the cache needs to have a corresponding directory on the file
> > > > system (“cache-…”) which can’t be created if the cache name contains
> > > > certain characters (or a reserved system name).
> > > >
> > > > A straightforward approach would be to check if a cache name is
> allowed
> > > on
> > > > the local system (e.g. via `Paths.get(name)`) and fail to create
> cache
> > if
> > > > it isn’t, but I’m a bit concerned with the consistency of the
> behavior
> > > (the
> > > > same cache name be allowed on one system and not on another).
> > > > I think a better way would be to replace special characters (say, all
> > > > non-alphanumeric characters) with underscores in file names (not
> > changing
> > > > the cache configuration). Would this be OK? Are there any risks I’m
> not
> > > > considering?
> > > >
> > > > WDYT?
> > > >
> > > > Thanks,
> > > > Stan
> > > >
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
> >
>


Re: Handling slashes in cache names

2017-12-26 Thread Dmitriy Setrakyan
Looks good to me. Is this going to be an exception on startup? If yes, is
it safe to release it, or should we wait till 3.0?

On Tue, Dec 26, 2017 at 2:08 AM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Thanks for the feedback.
>
> It seems that another thing to handle is case-insensitive FS – “mycache”
> and “MyCache” is the same on Windows, so it might be reasonable to disallow
> having two caches with names that are equal ignoring case.
> And one more thing is control characters – forbidding at least range of
> ASCII 0x00-0x20 seems reasonable.
>
> To summarize, a possible set of restrictions would be
> - Whitespace characters (via Character.isWhitespace)
> - Control characters (via Character.isISOControl)
> - Slashes
> - Characters reserved in Windows (<>:"/\|?*)
> - Length (say, up to 255)
> - Distinct names of caches when ignoring case
> It seems reasonable to enforce that even regardless of persistence
> directories naming (AFAIU that’s what Dmitry meant by forbidding things
> altogether), so that’s what I’m going to do.
> Any concerns?
> Specifically, would it be OK from backward compatibility point of view to
> forbid all these characters now for all caches?
>
> Thanks,
> Stan
>
>
> From: Alexey Kuznetsov
> Sent: 26 декабря 2017 г. 7:51
> To: dev@ignite.apache.org
> Subject: Re: Handling slashes in cache names
>
> It also make sense to limit cache name length to reasonable length.
> Because some File systems could have limitations on path length.
> See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
>
> On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
>
> > My preference would be to prohibit forward and backward slashes in cache
> > names altogether, as they may create a false feeling of some directory
> > structure, which does not exist. We should also prohibit spaces as well.
> >
> > D.
> >
> > On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> > stanlukya...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264,
> and
> > I
> > > need some guidance on what’s the best way to approach it.
> > >
> > > The problem is that cache names are not restricted, but if persistence
> is
> > > enabled the cache needs to have a corresponding directory on the file
> > > system (“cache-…”) which can’t be created if the cache name contains
> > > certain characters (or a reserved system name).
> > >
> > > A straightforward approach would be to check if a cache name is allowed
> > > on the local system (e.g. via `Paths.get(name)`) and fail to create the
> > > cache if it isn’t, but I’m a bit concerned with the consistency of the
> > > behavior (the same cache name may be allowed on one system and not on
> > > another).
> > > I think a better way would be to replace special characters (say, all
> > > non-alphanumeric characters) with underscores in file names (not
> > > changing the cache configuration). Would this be OK? Are there any risks
> > > I’m not considering?
> > >
> > > WDYT?
> > >
> > > Thanks,
> > > Stan
> > >
> >
>
>
>
> --
> Alexey Kuznetsov
>
>


RE: Handling slashes in cache names

2017-12-26 Thread Stanislav Lukyanov
Thanks for the feedback.

It seems that another thing to handle is case-insensitive file systems: 
“mycache” and “MyCache” are the same on Windows, so it might be reasonable to 
disallow having two caches with names that are equal ignoring case.
One more thing is control characters: forbidding at least the ASCII range 
0x00-0x20 seems reasonable.

To summarize, a possible set of restrictions would be:
- Whitespace characters (via Character.isWhitespace)
- Control characters (via Character.isISOControl)
- Slashes
- Characters reserved in Windows (<>:"/\|?*)
- Length (say, up to 255)
- Distinct names of caches when ignoring case
It seems reasonable to enforce that even regardless of persistence directories 
naming (AFAIU that’s what Dmitry meant by forbidding things altogether), so 
that’s what I’m going to do. 
Any concerns?
Specifically, would it be OK from a backward compatibility point of view to 
forbid all these characters now for all caches?
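
The restrictions listed above could be checked roughly as follows. This is only a sketch: the class `CacheNameValidator`, the method `firstViolation`, and the 255-character limit are hypothetical names and values picked for illustration, not existing Ignite API. Slashes need no separate check here because both are already in the Windows reserved set.

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;

/** Hypothetical validator for the proposed rules; not part of Ignite. */
final class CacheNameValidator {
    /** Characters reserved in Windows file names (includes both slashes). */
    private static final String WINDOWS_RESERVED = "<>:\"/\\|?*";

    /** Assumed upper bound on the name length, in chars. */
    private static final int MAX_LENGTH = 255;

    /** Returns a description of the first violated rule, or empty if the name passes. */
    static Optional<String> firstViolation(String name, Collection<String> existingNames) {
        if (name.length() > MAX_LENGTH)
            return Optional.of("longer than " + MAX_LENGTH + " characters");

        for (int i = 0; i < name.length(); ) {
            int cp = name.codePointAt(i);

            if (Character.isWhitespace(cp))
                return Optional.of("whitespace at index " + i);
            if (Character.isISOControl(cp))
                return Optional.of("control character at index " + i);
            if (WINDOWS_RESERVED.indexOf(cp) >= 0)
                return Optional.of("reserved character '" + (char) cp + "' at index " + i);

            i += Character.charCount(cp);
        }

        // Names must stay distinct on a case-insensitive file system.
        for (String other : existingNames) {
            if (other.equalsIgnoreCase(name))
                return Optional.of("equals existing cache name '" + other + "' ignoring case");
        }

        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> existing = List.of("mycache");

        for (String name : new String[] {"orders", "my cache", "a/b", "MyCache"})
            System.out.println(name + " -> " + firstViolation(name, existing).orElse("ok"));
    }
}
```

Names in other alphabets pass these checks, since only whitespace, control, and reserved characters are rejected.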

Thanks,
Stan


From: Alexey Kuznetsov
Sent: December 26, 2017, 7:51
To: dev@ignite.apache.org
Subject: Re: Handling slashes in cache names

It also makes sense to limit the cache name to a reasonable length,
because some file systems have limits on path length.
See: https://en.wikipedia.org/wiki/Filename#Length_restrictions

On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> My preference would be to prohibit forward and backward slashes in cache
> names altogether, as they may create a false feeling of some directory
> structure, which does not exist. We should prohibit spaces as well.
>
> D.
>
> On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I
> > need some guidance on what’s the best way to approach it.
> >
> > The problem is that cache names are not restricted, but if persistence is
> > enabled the cache needs to have a corresponding directory on the file
> > system (“cache-…”) which can’t be created if the cache name contains
> > certain characters (or a reserved system name).
> >
> > A straightforward approach would be to check if a cache name is allowed
> > on the local system (e.g. via `Paths.get(name)`) and fail to create the
> > cache if it isn’t, but I’m a bit concerned with the consistency of the
> > behavior (the same cache name may be allowed on one system and not on
> > another).
> > I think a better way would be to replace special characters (say, all
> > non-alphanumeric characters) with underscores in file names (not changing
> > the cache configuration). Would this be OK? Are there any risks I’m not
> > considering?
> >
> > WDYT?
> >
> > Thanks,
> > Stan
> >
>



-- 
Alexey Kuznetsov



Re: Handling slashes in cache names

2017-12-25 Thread Alexey Kuznetsov
It also makes sense to limit the cache name to a reasonable length,
because some file systems have limits on path length.
See: https://en.wikipedia.org/wiki/Filename#Length_restrictions
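
A rough sketch of such a check is below. It assumes a limit of 255 bytes per directory name, which is typical of many file systems but is not something the thread or Ignite specifies. The class `CacheDirNameLength` and the `dirPrefix` parameter are hypothetical. The point is that the limit applies to the whole encoded directory name, so the prefix and multi-byte characters both count.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical length check; the 255-byte limit is an assumption. */
final class CacheDirNameLength {
    /** Assumed per-name limit of the underlying file system, in bytes. */
    static final int MAX_NAME_BYTES = 255;

    /** True if prefix + cache name, encoded as UTF-8, fits into the assumed limit. */
    static boolean fits(String dirPrefix, String cacheName) {
        int bytes = (dirPrefix + cacheName).getBytes(StandardCharsets.UTF_8).length;

        return bytes <= MAX_NAME_BYTES;
    }

    public static void main(String[] args) {
        // 200 Cyrillic letters are 200 chars but 400 bytes in UTF-8.
        String cyrillic = "д".repeat(200);

        System.out.println("ASCII name fits: " + fits("cache-", "a".repeat(200)));
        System.out.println("Cyrillic name fits: " + fits("cache-", cyrillic));
    }
}
```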

On Tue, Dec 26, 2017 at 1:41 AM, Dmitriy Setrakyan 
wrote:

> My preference would be to prohibit forward and backward slashes in cache
> names altogether, as they may create a false feeling of some directory
> structure, which does not exist. We should prohibit spaces as well.
>
> D.
>
> On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I
> > need some guidance on what’s the best way to approach it.
> >
> > The problem is that cache names are not restricted, but if persistence is
> > enabled the cache needs to have a corresponding directory on the file
> > system (“cache-…”) which can’t be created if the cache name contains
> > certain characters (or a reserved system name).
> >
> > A straightforward approach would be to check if a cache name is allowed
> > on the local system (e.g. via `Paths.get(name)`) and fail to create the
> > cache if it isn’t, but I’m a bit concerned with the consistency of the
> > behavior (the same cache name may be allowed on one system and not on
> > another).
> > I think a better way would be to replace special characters (say, all
> > non-alphanumeric characters) with underscores in file names (not changing
> > the cache configuration). Would this be OK? Are there any risks I’m not
> > considering?
> >
> > WDYT?
> >
> > Thanks,
> > Stan
> >
>



-- 
Alexey Kuznetsov


Re: Handling slashes in cache names

2017-12-25 Thread Dmitriy Setrakyan
My preference would be to prohibit forward and backward slashes in cache
names altogether, as they may create a false feeling of some directory
structure, which does not exist. We should prohibit spaces as well.

D.

On Mon, Dec 25, 2017 at 7:09 AM, Stanislav Lukyanov 
wrote:

> Hi all,
>
> I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I
> need some guidance on what’s the best way to approach it.
>
> The problem is that cache names are not restricted, but if persistence is
> enabled the cache needs to have a corresponding directory on the file
> system (“cache-…”) which can’t be created if the cache name contains
> certain characters (or a reserved system name).
>
> A straightforward approach would be to check if a cache name is allowed on
> the local system (e.g. via `Paths.get(name)`) and fail to create the cache
> if it isn’t, but I’m a bit concerned with the consistency of the behavior
> (the same cache name may be allowed on one system and not on another).
> I think a better way would be to replace special characters (say, all
> non-alphanumeric characters) with underscores in file names (not changing
> the cache configuration). Would this be OK? Are there any risks I’m not
> considering?
>
> WDYT?
>
> Thanks,
> Stan
>


Handling slashes in cache names

2017-12-25 Thread Stanislav Lukyanov
Hi all,

I’m looking into https://issues.apache.org/jira/browse/IGNITE-7264, and I need 
some guidance on what’s the best way to approach it.

The problem is that cache names are not restricted, but if persistence is 
enabled the cache needs to have a corresponding directory on the file system 
(“cache-…”) which can’t be created if the cache name contains certain 
characters (or a reserved system name).

A straightforward approach would be to check if a cache name is allowed on the 
local system (e.g. via `Paths.get(name)`) and fail to create the cache if it 
isn’t, but I’m a bit concerned with the consistency of the behavior (the same 
cache name may be allowed on one system and not on another).
I think a better way would be to replace special characters (say, all 
non-alphanumeric characters) with underscores in file names (not changing the 
cache configuration). Would this be OK? Are there any risks I’m not considering?
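
To make the two options concrete, here is a minimal sketch. The class `CacheDirNames` and its methods are hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits only. `acceptedLocally` is the local-system check, and `sanitize` is the underscore replacement.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

/** Hypothetical helpers illustrating both approaches; not part of Ignite. */
final class CacheDirNames {
    /** True if the local default file system accepts the name as a path string. */
    static boolean acceptedLocally(String name) {
        try {
            Paths.get(name);

            return true;
        }
        catch (InvalidPathException e) {
            return false;
        }
    }

    /** Replaces every character other than an ASCII letter or digit with an underscore. */
    static String sanitize(String name) {
        return name.replaceAll("[^A-Za-z0-9]", "_");
    }

    public static void main(String[] args) {
        for (String name : new String[] {"orders", "a/b", "a_b", "my cache:1"})
            System.out.println(name + " -> " + sanitize(name)
                + " (accepted locally: " + acceptedLocally(name) + ")");
    }
}
```

Two things stand out in this sketch. As far as I know, `Paths.get` parses "a/b" as a path with two elements rather than rejecting it, so this check alone would not catch slashes. Also, `sanitize` maps both "a/b" and "a_b" to "a_b", so distinct cache names could end up with the same directory name.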

WDYT?

Thanks,
Stan