Re: Does ignite suite for large data search without index?

c c Thu, 21 Nov 2019 05:10:00 -0800

Hi,
We have filter conditions like these:
    age >= 18 and level = 1 and gender = 1
    (age >= 18 or level = 2) and hobby = 'music'
With one cache per column, joining the results is complicated.
Is there any way to make searching without an index fast?
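
For reference, here is a minimal sketch of the two conditions above written
as plain Java predicates over an in-memory list, which is the kind of linear
scan we compared against. The Member class, its field types, the sample rows
and the helper names are assumptions made for illustration; none of it is
Ignite API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class MemberFilter {
    // Hypothetical row type: field names follow the thread, types are assumed.
    public static final class Member {
        final String name;
        final int age;
        final int gender;
        final int level;
        final String hobby; // may be null when the member has no hobby

        public Member(String name, int age, int gender, int level, String hobby) {
            this.name = name;
            this.age = age;
            this.gender = gender;
            this.level = level;
            this.hobby = hobby;
        }
    }

    // age >= 18 and level = 1 and gender = 1
    public static final Predicate<Member> FILTER_A =
        m -> m.age >= 18 && m.level == 1 && m.gender == 1;

    // (age >= 18 or level = 2) and hobby = 'music'
    public static final Predicate<Member> FILTER_B =
        m -> (m.age >= 18 || m.level == 2) && "music".equals(m.hobby);

    // Linear scan: test every member against the filter and keep the names.
    public static List<String> namesMatching(List<Member> members, Predicate<Member> filter) {
        return members.stream()
            .filter(filter)
            .map(m -> m.name)
            .collect(Collectors.toList());
    }

    // Made-up sample rows, only to exercise the two filters.
    public static List<Member> sample() {
        return Arrays.asList(
            new Member("ann", 20, 1, 1, "music"),
            new Member("bob", 16, 1, 2, "music"),
            new Member("cid", 30, 0, 1, "chess"),
            new Member("dee", 17, 1, 1, null));
    }

    public static void main(String[] args) {
        System.out.println(namesMatching(sample(), FILTER_A));
        System.out.println(namesMatching(sample(), FILTER_B));
    }
}
```

Because each condition is just a boolean expression over one row, any
and/or combination of fields is evaluated in a single pass, with no need to
join per-column results.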


Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 8:08 PM:

> Hi!
>
> One idea would be to have one cache for each column, so that, for example,
> the key is the name and the value is the hobby. You get an index on the key
> for "free" and create one index on the value.
>
> If a name is not in the cache, that person does not have a hobby; only
> names that do have a hobby are in the cache. It would complicate the
> queries a bit, and you would need to run multiple queries, one per column.
> But index updates are fast, since you only need to update one index per
> cache if you only update a few columns; if you need to update all columns
> it will of course still need to update the indexes of all caches. I am not
> sure whether that would work for you; it depends on what kind of queries
> you need.
>
> In theory you could have 15 nodes and have one cache on each node and ask
> queries in parallel.
>
> I am not at all sure it will work well, it's just an idea.
>
> Mikael
>
>
> On 2019-11-21 at 12:17, c c wrote:
>
> Yes, we may add more columns in the future. Do you mean creating an index
> on each column or one index on multiple columns? The values in some
> columns also do not vary much, so many indexes would not be efficient:
> they would cost a lot of RAM and slow down updates and inserts (this table
> may be updated in real time). So we think simply traversing the collection
> in memory is good enough. A scalable cache would also remove the RAM limit
> and make filtering quicker.
>
> Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 7:06 PM:
>
>> Hi!
>>
>> Are the queries limited to something like "select name from ... where
>> hobby=x and location=y...", or do you need more complex queries?
>>
>> If the columns are fixed at 15, I don't see why you could not create 15
>> indexes. It would use lots of RAM and I don't think it's the best solution
>> either, but it should work.
>>
>> Is it fixed at 15 columns, or will you have to add more columns in the
>> future?
>>
>> On 2019-11-21 at 10:56, c c wrote:
>>
>> Hi Mikael,
>>      Thank you very much for your reply!
>>      The data looks like this:
>>      member [name, location, age, gender, hobby, level, credits, expense
>> ...]
>>      We need to filter data by arbitrary field combinations, so creating
>> indexes is not of much use. We thought traversing all the data in memory
>> would work better.
>>      We can keep all the data in RAM, but the data may grow progressively
>> and a single node is not scalable, so we plan to use a distributed
>> in-memory cache.
>>      We store the data off heap, all in RAM, with the default Ignite
>> serialization. We just create a table, populate it with the default
>> Ignite configuration, and query it by SQL (one node, 4 million records).
>>      Is there any way to improve query performance?
>>
>> Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 5:02 PM:
>>
>>> Hi!
>>>
>>> The comparison is not of much use. With Ignite it is not just a matter
>>> of searching a list: there is serialization/deserialization and other
>>> overhead that will make it slower than a simple list search. How a
>>> linear search on an Ignite cache performs depends on how you store the
>>> data (off heap/on heap, in RAM/partially on disk, type of serialization
>>> and so on).
>>>
>>> If you cannot keep all data in ram you are going to need some index to
>>> do a fast lookup, there is no way around it.
>>>
>>> If you can have all the data in RAM, why do you need Ignite? Do you
>>> have some other requirements that Ignite meets? Otherwise it might be
>>> simpler to just use a list in RAM and go with that.
>>>
>>> Is memory a limitation (on the cluster or on a single node)? If not,
>>> could you explain why it is difficult to create an index on the data?
>>>
>>> Could you explain what type of data it is? Maybe it is possible to
>>> arrange the data in some other way that improves things.
>>>
>>> Did you test with a single node or a cluster of nodes ? with more nodes
>>> you can improve performance as any search can be split up between the
>>> nodes, still, some kind of index will help a lot.
>>>
>>> Mikael
>>>
>>> On 2019-11-21 at 08:49, c c wrote:
>>> > Hi,
>>> >      We have a table with about 30 million records and 15 fields. We
>>> > need to implement a function that lets users filter records by any
>>> > combination of 12 of those fields (one, two, three... of them) with
>>> > very low latency, so it is difficult to create indexes. We think of
>>> > Ignite as an in-memory data grid and tested it with 4 million records
>>> > (one node) without creating an index. It took about 5 seconds to find
>>> > a record matching a one-field filter condition. We also tested simply
>>> > traversing a Java List (10 million elements) with 3 filter
>>> > conditions; that took about 0.1 seconds. We just want to know whether
>>> > Ignite suits this use case. Thanks very much.
>>> >
>>>
>>
