Re: Ignite as distributed file storage

2018-08-04 Thread Pavel Kovalenko
Dmitriy, This approach will also work with byte[] arrays and binary objects. It will just be a new addition to the binary object types, but the behaviour will be the same. What do you mean by custom logic for splitting a blob? By default a blob will be split into chunks of some size. This will be
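
A minimal sketch of the default chunking described in this message, for illustration only. This is not Ignite code: the names `split_blob` and `join_chunks` and the tiny `CHUNK_SIZE` are hypothetical, and the thread does not specify an actual chunk size or API.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical value chosen so the example is easy to follow;
# a real store would use a far larger chunk size.
CHUNK_SIZE = 4


def split_blob(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs covering the whole payload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, offset in enumerate(range(0, len(payload), chunk_size)):
        yield index, payload[offset:offset + chunk_size]


def join_chunks(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Reassemble a payload from chunks that may arrive in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(data for _, data in ordered)
```

Keeping the chunk index next to each chunk is one possible design choice: it lets chunks be stored or fetched independently and reassembled regardless of arrival order.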

Re: Ignite as distributed file storage

2018-08-03 Thread Denis Magda
Dmitriy, I would suggest that we not limit the blob use case to a dedicated cache. If you look at other databases, they usually have BLOB/LONGBLOB/etc. as a type, meaning that users mix simple and BLOB types in the same tables. Should we start with Ignite SQL, adding blobs through its APIs? -- Denis On

Re: Ignite as distributed file storage

2018-08-03 Thread Dmitriy Setrakyan
Pavel, Not everything that gets put in Ignite has a class, and not everything can be annotated. For example, what about byte[] or a binary object? I would prefer a separate cache with the specific purpose of storing blobs. It will be easier for users to understand and configure. It can also have a

Re: Ignite as distributed file storage

2018-08-03 Thread Alexey Kuznetsov
Pavel, Will this internal storage be mounted on all node hosts? Otherwise we will have huge traffic when rebalancing blobs. On Fri, Aug 3, 2018 at 5:21 PM Pavel Kovalenko wrote: > Dmitriy, > > I think we don't need a separate implementation of cache like BLOB cache. > Instead of it, a user can

Re: Ignite as distributed file storage

2018-08-03 Thread Pavel Kovalenko
Dmitriy, I think we don't need a separate cache implementation such as a BLOB cache. Instead, a user can mark a value class or a value class field with the special annotation "@BLOB". During a cache put, the marshaller will place a special placeholder in such fields, write the byte[] array payload of a

Re: Ignite as distributed file storage

2018-08-02 Thread Dmitriy Setrakyan
On Thu, Aug 2, 2018 at 1:08 AM, Pavel Kovalenko wrote: > Dmitriy, > > I still don't understand why you think that it will be a file system? > In all my previous messages I emphasized that this storage shouldn't be > considered as a file system. It's just a large data storage, whose entities >

Re: Ignite as distributed file storage

2018-08-02 Thread Dmitriy Pavlov
Hi Dmitriy, I appreciate members who concentrate on the code and on selecting the best option. But as community members we should drive the community to grow according to the 'Community first' principle. Then a good project and codebase will come by magic. At the same time, I suggest concentrating on

Re: Ignite as distributed file storage

2018-08-02 Thread Pavel Kovalenko
Dmitriy, I still don't understand why you think that it will be a file system? In all my previous messages I emphasized that this storage shouldn't be considered as a file system. It's just a large data storage, whose entities can be easily accessed using key/link (internally, or externally

Re: Ignite as distributed file storage

2018-08-01 Thread Dmitriy Setrakyan
Dmitriy, Pavel, Everything that gets accepted into the project has to make sense. I agree with Vladimir - we do not need more than one file system in Ignite. Given the amount of usage and the number of questions we get about IGFS, I would question whether Ignite needs a file system at all. As community

Re: Ignite as distributed file storage

2018-08-01 Thread Dmitriy Pavlov
Hi Vladimir, I think rejection by the community is possible only if the PMC vetoes the change. I didn't find any reason not to make this change or why it could be vetoed. I would appreciate it if you would become the mentor of this change and assist Pavel or another community member to make this

Re: Ignite as distributed file storage

2018-07-06 Thread Vladimir Ozerov
Pavel, I do not think it is a good idea to delay discussions and decisions, because it puts your efforts at risk of not being accepted by the community in the end. Our ultimate goal is not to have as many features as possible, but to have a consistent product which is easy to understand and use. Having

Re: Ignite as distributed file storage

2018-07-05 Thread Pavel Kovalenko
Vladimir, I just want to add that we can implement BLOB storage and then, if the community really wants it, we can adapt this storage for use as the underlying file system in IGFS. But IGFS shouldn't be the entry point for BLOB storage. I think this conclusion can satisfy both of us. 2018-07-06

Re: Ignite as distributed file storage

2018-07-05 Thread Pavel Kovalenko
Vladimir, I didn't say that it stores data on-heap; I said that it performs a lot of operations with byte[] arrays on-heap, as I see in , which will lead to frequent GCs and unnecessary data copying. "But the whole idea around mmap sounds like premature optimisation to me" - this is not

Re: Ignite as distributed file storage

2018-07-05 Thread Vladimir Ozerov
Pavel, IGFS doesn't force you to keep blocks on heap. What you suggest can be achieved with IGFS as follows: 1) Disable caching, so the data cache is not used ("PROXY" mode) 2) Implement the IgniteFileSystem interface, which operates on abstract streams. But the whole idea around mmap sounds like

Re: Ignite as distributed file storage

2018-07-05 Thread Pavel Kovalenko
Vladimir, The key difference between BLOB storage and IGFS is that BLOB storage will have a persistence-based architecture with the possibility to cache blocks off-heap (using mmap, which is simpler because we delegate it to the OS level), while IGFS has an in-memory-based architecture with the possibility
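
To illustrate the mmap idea mentioned in this message: a minimal sketch, assuming the blocks of a blob sit contiguously in one local file. The function name `read_block` and the `BLOCK_SIZE` value are hypothetical and not part of Ignite. The point is only that the mapping lets the OS page cache keep recently used blocks, so the application does not manage its own block cache.

```python
import mmap

# Hypothetical block size; the thread does not name a concrete value.
BLOCK_SIZE = 4096


def read_block(path: str, block_index: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Return one block of a non-empty file through a read-only memory mapping."""
    if block_index < 0 or block_size <= 0:
        raise ValueError("block_index must be >= 0 and block_size must be > 0")
    with open(path, "rb") as f:
        # Length 0 maps the whole file; caching of pages is left to the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            start = block_index * block_size
            # Only the requested slice is copied out of the mapping.
            return bytes(mapped[start:start + block_size])
```

The last block of a file may be shorter than `block_size`; the slice simply returns whatever bytes remain.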

Re: Ignite as distributed file storage

2018-07-05 Thread Vladimir Ozerov
Pavel, The design you described is almost precisely what IGFS does. It has a cache for metadata and splits binary data into chunks with intelligent affinity routing. In addition we have a map-reduce feature on top of it and integration with an underlying file system with optional caching. Data can be accessed in

Re: Ignite as distributed file storage

2018-07-03 Thread Sergey Kozlov
Dmitriy, You're right that storing large objects should be optimized. Let's assume a large object means a regular object with large fields, and that such fields won't be used for comparison; thus we can avoid restoring the BLOB fields in off-heap page memory, e.g. for SQL queries, if the select doesn't

Re: Ignite as distributed file storage

2018-07-02 Thread Dmitriy Setrakyan
To be honest, I am not sure if we need to kick off another file system storage discussion in Ignite. It sounds like a huge effort and likely will not be productive. However, I think the ability to store large objects would make sense. For example, how do I store a 10GB blob in an Ignite cache? Most

Re: Ignite as distributed file storage

2018-07-02 Thread Vladimir Ozerov
Pavel, Thank you. I'll wait for a feature comparison and concrete use cases, because for me this feature still sounds too abstract to judge whether the product would benefit from it. On Mon, Jul 2, 2018 at 3:15 PM Pavel Kovalenko wrote: > Dmitriy, > > I think we have a little miscommunication here.

Re: Ignite as distributed file storage

2018-07-02 Thread Pavel Kovalenko
Dmitriy, I think we have a little miscommunication here. Of course, I meant supporting large entries / chunks of binary data. Internally it will be BLOB storage, which can be accessed through various interfaces. "File" is just an abstraction for an end user for convenience, a wrapper layer to

Re: Ignite as distributed file storage

2018-07-01 Thread Dmitriy Setrakyan
Pavel, I have actually misunderstood the use case. To be honest, I thought that you were talking about the support of large values in Ignite caches, e.g. objects that are several megabytes in cache. If we are tackling the distributed file system, then in my view, we should be talking about IGFS

Re: Ignite as distributed file storage

2018-06-30 Thread Dmitry Pavlov
I definitely support adding this functionality. As an Ignite user I develop the MTCGA Bot; this tool stores test results from previous TC runs. In addition to test results it also stores thread dumps and, sometimes, logs. It would be very convenient and more productive to store this data in such file

Re: Ignite as distributed file storage

2018-06-30 Thread Vladimir Ozerov
Pavel, Can you provide a competitive analysis against other storage solutions? What products will we compete with? What would be our advantages over them? I talked to several folks working on solutions involving video and image processing. They rarely use any databases or grids. Neither they

Re: Ignite as distributed file storage

2018-06-30 Thread Pavel Kovalenko
Dmitriy, Yes, I have an approximate design in mind. The main idea is that we already have a distributed cache for file metadata (our Atomic cache); the data flow and distribution will be controlled by our AffinityFunction and Baseline. We already have discovery and communication to make such

Re: Ignite as distributed file storage

2018-06-30 Thread Dmitriy Setrakyan
Pavel, it definitely makes sense. Do you have a design in mind? D. On Sat, Jun 30, 2018, 07:24 Pavel Kovalenko wrote: > Igniters, > > I would like to start a discussion about designing a new feature because I > think it's time to start making steps towards it. > I noticed, that some of our

Re: Ignite as distributed file storage

2018-06-30 Thread Denis Magda
Hello Pavel, Agree that our users want to store large entries occasionally. Got several inquiries from those who are dealing with audio and video data sets. What do you think has to be changed at our memory level so that we can store such data efficiently? Denis On Saturday, June 30, 2018,

Ignite as distributed file storage

2018-06-30 Thread Pavel Kovalenko
Igniters, I would like to start a discussion about designing a new feature because I think it's time to start making steps towards it. I noticed that some of our users have tried to store large homogeneous entries (> 1, 10, 100 Mb/Gb/Tb) in our caches, but without much success. The IGFS project has