Hi,

We have filter conditions like these:

  age >= 18 and level = 1 and gender = 1
  (age >= 18 or level = 2) and hobby = 'music'

With one cache per column, joining the results is complicated. Is there any way to make searching fast without an index?
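For reference, the two conditions above can be written as plain Java predicates and evaluated in a single pass over an in-memory list, which is the kind of traversal discussed further down the thread. This is only a minimal sketch: the `Member` class, the `MemberScan` helper, and the sample rows are hypothetical, and nothing here uses the Ignite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical row type; field names follow the columns mentioned in the thread.
final class Member {
    final String name;
    final int age;
    final int level;
    final int gender;
    final String hobby;

    Member(String name, int age, int level, int gender, String hobby) {
        this.name = name;
        this.age = age;
        this.level = level;
        this.gender = gender;
        this.hobby = hobby;
    }
}

final class MemberScan {
    // age >= 18 and level = 1 and gender = 1
    static final Predicate<Member> FILTER_A =
        m -> m.age >= 18 && m.level == 1 && m.gender == 1;

    // (age >= 18 or level = 2) and hobby = 'music'
    static final Predicate<Member> FILTER_B =
        m -> (m.age >= 18 || m.level == 2) && "music".equals(m.hobby);

    // One pass over all rows, no index: cost grows linearly with the row count.
    static List<String> scan(List<Member> rows, Predicate<Member> filter) {
        List<String> names = new ArrayList<>();
        for (Member m : rows) {
            if (filter.test(m)) {
                names.add(m.name);
            }
        }
        return names;
    }

    // Made-up sample data, for illustration only.
    static List<Member> sample() {
        return Arrays.asList(
            new Member("ann", 20, 1, 1, "music"),
            new Member("bob", 17, 2, 1, "music"),
            new Member("cid", 30, 1, 2, "chess"),
            new Member("dee", 16, 1, 1, "music"));
    }

    public static void main(String[] args) {
        List<Member> rows = sample();
        System.out.println(scan(rows, FILTER_A));
        System.out.println(scan(rows, FILTER_B));
    }
}
```

Because each condition is an ordinary predicate, arbitrary field combinations need no per-column join; the trade-off is that every query touches every row.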
Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 8:08 PM:

> Hi!
>
> One idea would be to have one cache for each column, so the key is the name
> and the value is the hobby, for example. You get an index on the key for
> "free" and create one index on the value.
>
> If the cache does not contain a name, that person does not have a hobby; only
> names that do have a hobby are in the cache. It would complicate the query
> a bit, and you need to run multiple queries, one for each column, but
> updating the index is fast, as you only need to update one index for each
> cache if you only update a few columns. If you need to update all columns,
> it will of course still need to update the index for all caches. I am not
> sure if that would work for you; it depends on what kind of queries you need.
>
> In theory you could have 15 nodes, with one cache on each node, and run the
> queries in parallel.
>
> I am not at all sure it will work well, it's just an idea.
>
> Mikael
>
> On 2019-11-21 at 12:17, c c wrote:
>
> Yes, we may add more columns in the future. Do you mean creating an index
> on one column or on multiple columns? Also, the values in some columns do
> not differ much. So many indexes would not be efficient; they would cost a
> lot of RAM and decrease update or insert performance (this table may be
> updated in real time). So we think just traversing the collection in memory
> is good. A scalable cache would also get rid of the RAM limit and make
> filtering quicker.
>
> Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 7:06 PM:
>
>> Hi!
>>
>> Are the queries limited to something like "select name from ... where
>> hobby=x and location=y...", or do you need more complex queries?
>>
>> If the columns are fixed at 15, I don't see why you could not create 15
>> indices. It would use lots of RAM, and I don't think it's the best
>> solution either, but it should work.
>>
>> Is it fixed at 15 columns, or will you have to add more columns in the
>> future?
>>
>> On 2019-11-21 at 10:56, c c wrote:
>>
>> Hi, Mikael
>> Thanks very much for your reply!
>> The data looks like this:
>> member [name, location, age, gender, hobby, level, credits, expense ...]
>> We need to filter data by arbitrary combinations of fields, so creating
>> an index is not of much use. We thought traversing all the data in memory
>> would work better.
>> We can keep all the data in RAM, but the data may grow progressively, and
>> a single node is not scalable, so we plan to use a distributed memory
>> cache.
>> We store the data off heap and all in RAM with default Ignite
>> serialization. We just create the table, then populate it with the default
>> Ignite configuration, and query by SQL (one node, 4 million records).
>> Is there any way to improve query performance?
>>
>> Mikael <[email protected]> wrote on Thu, Nov 21, 2019 at 5:02 PM:
>>
>>> Hi!
>>>
>>> The comparison is not of much use. When you talk about Ignite, it's not
>>> just searching a list; there is serialization/deserialization and other
>>> things to consider that will make it slower compared to a simple list
>>> search. A linear search on an Ignite cache depends on how you store the
>>> data (off heap/on heap, in RAM/partially on disk, type of serialization
>>> and so on).
>>>
>>> If you cannot keep all the data in RAM, you are going to need some index
>>> to do a fast lookup; there is no way around it.
>>>
>>> If you can have all the data in RAM, why do you need Ignite? Do you
>>> have some other requirements that Ignite satisfies? Otherwise it
>>> might be simpler to just use a list in RAM and go with that.
>>>
>>> Is memory a limitation (cluster or single node)? If not, could you
>>> explain why it is difficult to create an index on the data?
>>>
>>> Could you explain what type of data it is? Maybe it is possible to
>>> arrange the data in some other way to improve everything.
>>>
>>> Did you test with a single node or a cluster of nodes? With more nodes
>>> you can improve performance, as any search can be split up between the
>>> nodes. Still, some kind of index will help a lot.
>>>
>>> Mikael
>>>
>>> On 2019-11-21 at 08:49, c c wrote:
>>> > Hi,
>>> > We have a table with about 30 million records and 15 fields. We
>>> > need to implement a function that lets users filter records by any
>>> > of 12 fields (one, two, three... of them) with very low latency. It's
>>> > difficult to create an index. Since Ignite is an in-memory data grid,
>>> > we tested it with 4 million records (one node) without creating an
>>> > index. It took about 5 seconds to find a record matching a
>>> > single-field filter condition. We also tested simply traversing a Java
>>> > List (10 million elements) with 3 filter conditions; it took about
>>> > 0.1 second. We just want to know whether Ignite suits this use case.
>>> > Thanks very much.
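
To make the "one cache per column" idea from the thread concrete, here is a rough sketch of how per-column results might be combined: AND becomes a set intersection of the matching keys and OR becomes a set union. Plain `java.util.Map` instances stand in for the per-column caches, and the `ColumnJoin` class, its method names, and the sample data are all hypothetical; this is not Ignite code and says nothing about how Ignite itself would execute such a query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

final class ColumnJoin {
    // Keys (member names) whose value in one column passes the test.
    static <V> Set<String> matching(Map<String, V> column, Predicate<V> test) {
        Set<String> keys = new TreeSet<>();
        for (Map.Entry<String, V> e : column.entrySet()) {
            if (test.test(e.getValue())) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // AND of two per-column results: set intersection.
    static Set<String> and(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // OR of two per-column results: set union.
    static Set<String> or(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // (age >= 18 or level = 2) and hobby = 'music'
    static Set<String> query(Map<String, Integer> age,
                             Map<String, Integer> level,
                             Map<String, String> hobby) {
        Set<String> adults = matching(age, v -> v >= 18);
        Set<String> levelTwo = matching(level, v -> v == 2);
        Set<String> music = matching(hobby, "music"::equals);
        return and(or(adults, levelTwo), music);
    }

    // Made-up sample data; "cid" has no hobby, so it is absent from that map.
    static Set<String> demo() {
        Map<String, Integer> age = new HashMap<>();
        age.put("ann", 20);
        age.put("bob", 17);
        age.put("cid", 30);
        Map<String, Integer> level = new HashMap<>();
        level.put("ann", 1);
        level.put("bob", 2);
        level.put("cid", 1);
        Map<String, String> hobby = new HashMap<>();
        hobby.put("ann", "music");
        hobby.put("bob", "music");
        return query(age, level, hobby);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The join itself is mechanical, but each sub-condition produces an intermediate key set that has to be materialized and merged, which is where the extra complexity and cost of this layout come from.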
