Hi!
One idea would be to have one cache per column, so that the key is the
name and the value is, for example, the hobby. You get the index on the
key for "free" and create one index on the value.
If a cache does not contain a name, that person does not have a hobby;
only names that do have a hobby are in the cache. This complicates the
query a bit, and you need to run one query per column, but updating is
fast: if you only update a few columns, you only update one index per
affected cache. If you update all columns, it will of course still need
to update the index in every cache. I am not sure whether that would
work for you; it depends on what kind of queries you need.
In theory you could have 15 nodes with one cache on each node and run
the queries in parallel.
I am not at all sure it will work well, it's just an idea.
Mikael
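To make the one-cache-per-column idea concrete, here is a minimal plain-Java sketch. It does not use the Ignite API: the class ColumnCaches and its methods put and query are hypothetical names, and ordinary maps stand in for the caches and their value indexes. A query does one lookup per column and intersects the resulting name sets.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java stand-in for "one cache per column" (assumption: not Ignite code).
class ColumnCaches {
    // column -> (name -> value): the per-column "cache", keyed by name.
    private final Map<String, Map<String, String>> byName = new HashMap<>();
    // column -> (value -> names): the single value index kept per column.
    private final Map<String, Map<String, Set<String>>> byValue = new HashMap<>();

    // Set one column for one member; only that column's index is touched.
    void put(String column, String name, String value) {
        Map<String, String> cache = byName.computeIfAbsent(column, c -> new HashMap<>());
        Map<String, Set<String>> index = byValue.computeIfAbsent(column, c -> new HashMap<>());
        String old = cache.put(name, value);
        if (old != null) {
            // Remove the stale index entry for the previous value.
            Set<String> oldNames = index.get(old);
            oldNames.remove(name);
            if (oldNames.isEmpty()) {
                index.remove(old);
            }
        }
        index.computeIfAbsent(value, v -> new HashSet<>()).add(name);
    }

    // Names matching every column=value condition: one index lookup per
    // column, then intersect. No conditions yields an empty set in this sketch.
    Set<String> query(Map<String, String> conditions) {
        Set<String> result = null;
        for (Map.Entry<String, String> cond : conditions.entrySet()) {
            Set<String> names = byValue
                .getOrDefault(cond.getKey(), Collections.emptyMap())
                .getOrDefault(cond.getValue(), Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(names);
            } else {
                result.retainAll(names);
            }
            if (result.isEmpty()) {
                break; // no member can match the remaining conditions
            }
        }
        return result == null ? new TreeSet<>() : result;
    }
}
```

A member with no entry in a column's map simply never appears in that column's lookups, which mirrors the "not in the cache means no hobby" rule above.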
On 2019-11-21 at 12:17, c c wrote:
Yes, we may add more columns in the future. Do you mean creating an
index on one column or on multiple columns? Also, the values in some
columns do not vary much. So many indexes would not be efficient: they
would cost a lot of RAM and decrease update and insert performance (this
table may be updated in real time). So we think just traversing the
collection in memory is good. And a cache is scalable, which gets rid of
the RAM limit and makes filtering quicker.
Mikael <[email protected]> wrote on Thursday, November 21, 2019 at
7:06 PM:
Hi!
Are the queries limited to something like "select name from ...
where hobby=x and location=y...", or do you need more complex queries?
If the columns are fixed at 15, I don't see why you could not
create 15 indexes. It would use lots of RAM, and I don't think it's
the best solution either, but it should work.
Is it fixed at 15 columns, or will you have to add more columns
in the future?
On 2019-11-21 at 10:56, c c wrote:
Hi Mikael,
Thank you very much for your reply!
The data looks like this:
member [name, location, age, gender, hobby, level, credits,
expense ...]
We need to filter data by arbitrary combinations of fields, so
creating an index is not of much use. We thought traversing all data
in memory would work better.
We can keep all data in RAM, but the data may grow progressively,
and a single node is not scalable. So we plan to use a distributed
in-memory cache.
We store the data off heap and entirely in RAM with the default
Ignite serialization. We just create a table, then populate it with
the default Ignite configuration, and query by SQL (one node,
4 million records).
Is there any way to improve query performance?
Mikael <[email protected]> wrote on Thursday, November 21, 2019 at
5:02 PM:
Hi!
The comparison is not of much use. With Ignite, it is not just a
search through a list: there is serialization/deserialization and
other things to consider that will make it slower than a simple list
search. A linear search on an Ignite cache depends on how you store
the data (off heap/on heap, in RAM/partially on disk, type of
serialization and so on).
If you cannot keep all data in RAM, you are going to need some index
to do a fast lookup; there is no way around it.
If you can have all the data in RAM, why do you need Ignite? Do you
have some other requirements that Ignite meets? Otherwise it might be
simpler to just use a list in RAM and go with that.
Is memory a limitation (cluster or single node)? If not, could you
explain why it is difficult to create an index on the data?
Could you explain what type of data it is? Maybe it is possible to
arrange the data in some other way to improve everything.
Did you test with a single node or a cluster of nodes? With more
nodes you can improve performance, as any search can be split up
between the nodes. Still, some kind of index will help a lot.
Mikael
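The point about splitting a search between nodes can be illustrated on a single machine. The sketch below is plain Java, not Ignite code: the class PartitionedScan and its scan method are hypothetical, and threads stand in for cluster nodes, each scanning its own slice of the data before the partial results are merged in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Threads play the role of nodes here (assumption: single-JVM illustration only).
class PartitionedScan {
    // Scan `data` in up to `partitions` slices in parallel; `partitions` must be >= 1.
    static <T> List<T> scan(List<T> data, int partitions, Predicate<T> filter) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            // Slice size, rounded up so every element lands in some slice.
            int chunk = (data.size() + partitions - 1) / partitions;
            List<Future<List<T>>> parts = new ArrayList<>();
            for (int start = 0; start < data.size(); start += chunk) {
                List<T> slice = data.subList(start, Math.min(start + chunk, data.size()));
                // Each task performs a linear scan over its own slice.
                parts.add(pool.submit(() -> {
                    List<T> hits = new ArrayList<>();
                    for (T row : slice) {
                        if (filter.test(row)) {
                            hits.add(row);
                        }
                    }
                    return hits;
                }));
            }
            // Merge the partial results in slice order.
            List<T> result = new ArrayList<>();
            for (Future<List<T>> part : parts) {
                result.addAll(part.get());
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partitioned scan failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each slice is still scanned linearly, so this only divides the work; it does not replace an index.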
On 2019-11-21 at 08:49, c c wrote:
> Hi,
> We have a table with about 30 million records and 15 fields. We need
> to implement a function that lets users filter records by any
> combination of 12 fields (one, two, three... of them) with very low
> latency. It is difficult to create an index. We think of Ignite as an
> in-memory grid cache and tested it with 4 million records (one node)
> without creating an index. It took about 5 seconds to find a record
> matching a one-field filter condition. We also tested just traversing
> a Java List (10 million elements) with 3 filter conditions. It took
> about 0.1 second. We just want to know whether Ignite suits this use
> case. Thanks very much.
>
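For reference, the plain Java List traversal mentioned in the original question can be sketched as follows. This is only an illustration: the Member class, its fields, and the helper names allOf and scan are assumptions, not the poster's actual code. Whichever conditions the user supplies are ANDed into one predicate and applied in a single pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class MemberScan {
    // Hypothetical record type with a few of the fields named in the thread.
    static final class Member {
        final String name;
        final String location;
        final int age;
        final String hobby;

        Member(String name, String location, int age, String hobby) {
            this.name = name;
            this.location = location;
            this.age = age;
            this.hobby = hobby;
        }
    }

    // AND together whichever conditions were supplied; none means "match all".
    static Predicate<Member> allOf(List<Predicate<Member>> conditions) {
        Predicate<Member> combined = m -> true;
        for (Predicate<Member> condition : conditions) {
            combined = combined.and(condition);
        }
        return combined;
    }

    // Single linear pass over the in-memory list.
    static List<Member> scan(List<Member> members, Predicate<Member> filter) {
        List<Member> hits = new ArrayList<>();
        for (Member m : members) {
            if (filter.test(m)) {
                hits.add(m);
            }
        }
        return hits;
    }
}
```

Because the objects are already on the heap, this pass involves no deserialization, which is one reason the comparison with a cache scan is not like for like.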