Hi!

One idea would be to have one cache for each column, so the key is the name and the value is, for example, the hobby. You get an index on the key for "free" and can create one index on the value.

If the cache does not contain a name, that person does not have a hobby; only names that do have a hobby are in the cache. It would complicate the query a bit, and you would need to run one query per column, but updating the indexes is fast: if you only update a few columns, you only need to update the one index of each affected cache. If you need to update all columns, it will of course still have to update the index of every cache. I am not sure if that would work for you; it depends on what kind of queries you need.
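The idea can be sketched in plain Java, with HashMaps standing in for the per-column Ignite caches (all names and values here are made up):

```java
import java.util.*;
import java.util.stream.*;

public class PerColumnSketch {

    // One map per column, keyed by member name (stand-ins for one
    // Ignite cache per column; a missing key means "no value").
    static Map<String, String> hobbyCache = new HashMap<>();
    static Map<String, String> locationCache = new HashMap<>();

    // Collect the names whose value in one column cache matches;
    // in Ignite this would be one indexed query on the value.
    static Set<String> matches(Map<String, String> cache, String value) {
        return cache.entrySet().stream()
                .filter(e -> e.getValue().equals(value))
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    // A multi-column filter is the intersection of the per-column results.
    static Set<String> filter(String hobby, String location) {
        Set<String> result = new HashSet<>(matches(hobbyCache, hobby));
        result.retainAll(matches(locationCache, location));
        return result;
    }

    public static void main(String[] args) {
        hobbyCache.put("alice", "chess");
        hobbyCache.put("bob", "chess");
        locationCache.put("alice", "oslo");
        locationCache.put("bob", "paris");
        System.out.println(filter("chess", "oslo")); // prints [alice]
    }
}
```

Each set computation would correspond to one indexed query against its column cache, and the intersection is what makes multi-column filters work with this layout.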

In theory you could have 15 nodes, with one cache on each node, and run the queries in parallel.
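As a rough illustration in plain Java (with a parallel stream standing in for the nodes; the data and names are invented), a linear filter can be split across parallel workers like this:

```java
import java.util.*;
import java.util.stream.*;

public class ParallelScanSketch {

    // A member record; the fields match the columns mentioned in the thread.
    record Member(String name, String hobby, String location) {}

    // A linear filter split across worker threads by a parallel stream,
    // the way a cluster would split the scan between nodes.
    static List<String> parallelFilter(List<Member> members, String hobby) {
        return members.parallelStream()
                .filter(m -> m.hobby().equals(hobby))
                .map(Member::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Member> data = List.of(
                new Member("alice", "chess", "oslo"),
                new Member("bob", "golf", "paris"),
                new Member("carol", "chess", "lund"));
        System.out.println(parallelFilter(data, "chess")); // prints [alice, carol]
    }
}
```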

I am not at all sure it will work well; it's just an idea.

Mikael


On 2019-11-21 12:17, c c wrote:
Yes, we may add more columns in the future. Do you mean creating one index per column, or one index on multiple columns? Also, the values of some columns do not differ much, so many indexes would not be efficient: they would cost a lot of RAM and decrease update/insert performance (this table may be updated in real time). So we think just traversing the collection in memory is good, and because the cache is scalable, it removes the RAM limit and makes filtering quicker.

Mikael <[email protected] <mailto:[email protected]>> 于2019年11月21日周四 下午7:06写道:

    Hi!

    Are the queries limited to something like "select name from ...
    where hobby=x and location=y ...", or do you need more complex queries?

    If the columns are fixed at 15, I don't see why you could not
    create 15 indexes. It would use lots of RAM, and I don't think
    it's the best solution either, but it should work.
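    For reference, if the data were exposed as an Ignite SQL table,
    the indexes could be declared with plain DDL; the table and
    column names below are only assumed examples:

    ```sql
    CREATE INDEX member_hobby_idx ON member (hobby);
    CREATE INDEX member_location_idx ON member (location);
    ```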

    Is it fixed at 15 columns, or will you have to add more columns
    in the future?

    On 2019-11-21 10:56, c c wrote:

    Hi Mikael,
         Thanks very much for your reply!
         The data looks like this:
         member [name, location, age, gender, hobby, level, credits,
    expense ...]
         We need to filter data by arbitrary combinations of fields,
    so creating indexes is not of much use; we thought traversing all
    the data in memory would work better.
         We can keep all the data in RAM, but the data may grow
    progressively, and a single node is not scalable, so we plan to
    use a distributed memory cache.
         We store the data off heap, all in RAM, with the default
    Ignite serialization. We just create the table, then populate the
    data with the default Ignite configuration, and query by SQL (one
    node, 4 million records).
         Is there any way to improve query performance?

    Mikael <[email protected]
    <mailto:[email protected]>> 于2019年11月21日周四
    下午5:02写道:

        Hi!

        The comparison is not of much use. When you talk about
        Ignite, it is not just searching a list: there is
        serialization/deserialization and other things to consider
        that will make it slower than a simple list search. How fast
        a linear search on an Ignite cache is depends on how you
        store the data (off heap/on heap, in RAM/partially on disk,
        type of serialization, and so on).

        If you cannot keep all the data in RAM, you are going to
        need some index to do a fast lookup; there is no way around
        it.

        If you can have all the data in RAM, why do you need Ignite?
        Do you have some other requirement that Ignite satisfies?
        Otherwise it might be simpler to just use a list in RAM and
        go with that.

        Is memory a limitation (cluster or single node)? If not,
        could you explain why it is difficult to create an index on
        the data?

        Could you explain what type of data it is? Maybe it is
        possible to arrange the data in some other way to improve
        everything.

        Did you test with a single node or a cluster of nodes? With
        more nodes you can improve performance, as any search can be
        split up between the nodes; still, some kind of index will
        help a lot.

        Mikael

        On 2019-11-21 08:49, c c wrote:
        > Hi,
        >      We have a table with about 30 million records and 15
        > fields. We need to implement a function where a user can
        > filter records by an arbitrary combination of 12 of the
        > fields (one, two, three... of them) with very low latency.
        > It is difficult to create indexes. We understand Ignite is
        > an in-memory data grid, and we tested it with 4 million
        > records (one node) without creating an index. It took
        > about 5 seconds to find a record matching a one-field
        > filter condition. We also tested just traversing a Java
        > List (10 million elements) with 3 filter conditions; it
        > took about 0.1 seconds. We just want to know whether
        > Ignite suits this use case. Thanks very much.
        >
