You got it :D This is the real issue. However, it's quite an extreme case. If you can guarantee a minimum of X articles per day and per country, the maximum number of requests needed to fetch 100 articles is bounded (at most ceil(100 / X)).
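A minimal sketch of that bounded walk back in time, in Python. It is not tied to any driver: `fetch_bucket` is a hypothetical callable standing in for the per-partition SELECT, and `max_days_back` is an assumed safety cap so that empty buckets cannot cause an endless loop. Only the key format (XX-YYYYMMDD) comes from the thread.

```python
from datetime import date, timedelta
from math import ceil


def bucket_key(country, day):
    """Build a partition key such as 'US-20150118' (XX-YYYYMMDD)."""
    return f"{country}-{day:%Y%m%d}"


def fetch_newest(fetch_bucket, country, today, limit=100, max_days_back=30):
    """Collect up to `limit` article IDs, newest bucket first.

    `fetch_bucket(key, n)` is a hypothetical callable returning at most `n`
    article IDs from one day bucket, newest first (an empty list if the
    bucket does not exist). `max_days_back` caps the number of buckets
    visited, so sparse data cannot loop back in time forever.
    Returns the articles and the number of queries issued.
    """
    articles = []
    queries = 0
    for offset in range(max_days_back):
        if len(articles) >= limit:
            break
        key = bucket_key(country, today - timedelta(days=offset))
        articles.extend(fetch_bucket(key, limit - len(articles)))
        queries += 1
    return articles, queries


def max_queries(limit, min_per_day):
    """Upper bound on queries when every day holds at least `min_per_day` articles."""
    return ceil(limit / min_per_day)
```

With a guaranteed minimum of 10 articles per day, `max_queries(100, 10)` gives 10 queries at most; without such a guarantee, the cap is the only thing that stops the loop.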
Furthermore, do not forget that a SELECT statement using a partition key will leverage bloom filters, so when the partition does not exist (no article for a day), Cassandra will not touch the disk, barring the occasional bloom filter false positive.

On Thu, Jan 22, 2015 at 9:30 PM, SEGALIS Morgan <msega...@gmail.com> wrote:

> Oh yeah, I thought about it, and even raised the point in the first mail:
>
> "Let's say I want to show only 100 of the newer articles, I'll get the
> today's articles, and if it does not fill the request (too few articles),
> I'll check the day before that, etc..."
>
> but your answer raised another issue I had not thought of before:
> - going back over previous days, let's say I want the 100 newest articles
> - if there is at most 1 article per day, and 0 on some days, I will have
>   to do 100+ queries to get all the posts. Won't that be a little too much?
>
> 2015-01-22 20:47 GMT+01:00 DuyHai Doan <doanduy...@gmail.com>:
>
>> Well, if the current day bucket does not contain enough articles, you may
>> need to search back in the previous day. If the previous day does not have
>> any articles, you may need to go back a day before that ... and so on ...
>>
>> Of course it's a corner case, but I've seen some code that misses this
>> scenario and ends up in an infinite loop back in time ...
>>
>> On Thu, Jan 22, 2015 at 8:41 PM, SEGALIS Morgan <msega...@gmail.com>
>> wrote:
>>
>>> Hi DuyHai,
>>>
>>> if there are 0 articles, the row will obviously not exist, I guess (no
>>> article insertion will create the row).
>>> What is bugging you exactly?
>>>
>>> 2015-01-22 20:33 GMT+01:00 DuyHai Doan <doanduy...@gmail.com>:
>>>
>>>> Hello Morgan
>>>>
>>>> The data model looks reasonable. Bucketing by day will help you to
>>>> scale. The only thing I can see is how to go back in time to fetch
>>>> articles from previous buckets (previous days). Is it possible to have
>>>> 0 articles for a country for a day?
>>>>
>>>> On Thu, Jan 22, 2015 at 8:23 PM, SEGALIS Morgan <msega...@gmail.com>
>>>> wrote:
>>>>
>>>>> Sorry, I copied/pasted the question from another platform where you
>>>>> don't generally say hello.
>>>>>
>>>>> So: Hello everyone,
>>>>>
>>>>> 2015-01-22 20:19 GMT+01:00 SEGALIS Morgan <msega...@gmail.com>:
>>>>>
>>>>>> I have a column family that stores articles. I'll need to get those
>>>>>> articles from the most recent to the oldest, fetching them by country,
>>>>>> with of course the ability to limit the number of fetched articles.
>>>>>>
>>>>>> I thought about another ColumnFamily "ArticlesByDateAndCountry" with
>>>>>> dynamic columns.
>>>>>>
>>>>>> The key would be a mix of the 2-char country code (ISO 3166-1) and
>>>>>> the article's day date, so something like: US-20150118 or FR-20141230
>>>>>> (XX-YYYYMMDD)
>>>>>>
>>>>>> In those rows, the column name would be the timeuuid of the article,
>>>>>> and the value would be the article's ID.
>>>>>>
>>>>>> It would probably get about a thousand articles per day for each
>>>>>> country.
>>>>>>
>>>>>> Let's say I want to show only 100 of the newer articles, I'll get the
>>>>>> today's articles, and if it does not fill the request (too few articles),
>>>>>> I'll check the day before that, etc...
>>>>>>
>>>>>> Is that the best practice, or does someone have a better idea for this
>>>>>> purpose?
>>>>>>
>>>>>
>>>>> --
>>>>> Morgan SEGALIS
>>>>>
>>>
>>> --
>>> Morgan SEGALIS
>>>
>
> --
> Morgan SEGALIS
>