The rendering tool renders a portion of a very large image. It may fetch different data each time, from billions of rows, so I don't think I can cache such large results; the same results will rarely be fetched again.
Also, do you know how I can do 2D range queries using Cassandra? Some other users suggested using Solr, but is there any way I can achieve that without adding another technology? (Two rough sketches, one of the fetch-size suggestion and one of a possible 2D layout, follow after the quoted thread below.)

On Wed, Mar 18, 2015 at 4:33 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:

> Sorry, meant to say "that way when you have to render, you can just
> display the latest cache."
>
> On Wed, Mar 18, 2015 at 1:30 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>
>> I would probably do this in a background thread and cache the results,
>> that way when you have to render, you can just cache the latest results.
>>
>> I don't know why Cassandra can't seem to fetch large batch sizes; I've
>> also run into these timeouts, but reducing the batch size to 2k seemed
>> to work for me.
>>
>> On Wed, Mar 18, 2015 at 1:24 PM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>
>>> We have a UI which needs this data for rendering, so the efficiency of
>>> pulling this data matters a lot. It should be fetched within a minute.
>>> Is there a way to achieve such efficiency?
>>>
>>> On Wed, Mar 18, 2015 at 4:06 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>>>
>>>> Perhaps just fetch them in batches of 1000 or 2000? For 1m rows, it
>>>> seems like the difference would only be a few minutes. Do you have to
>>>> do this all the time, or only once in a while?
>>>>
>>>> On Wed, Mar 18, 2015 at 12:34 PM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>>>
>>>>> Yes, it works for 1000 but not more than that.
>>>>> How can I fetch all rows using this efficiently?
>>>>>
>>>>> On Wed, Mar 18, 2015 at 3:29 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>>>>>
>>>>>> Have you tried a smaller fetch size, such as 5k - 2k?
>>>>>>
>>>>>> On Wed, Mar 18, 2015 at 12:22 PM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>>>>>
>>>>>>> Hi Jens,
>>>>>>>
>>>>>>> I have tried a fetch size of 10000 and it is still not giving any
>>>>>>> results. My expectation was that Cassandra could handle a million
>>>>>>> rows easily.
>>>>>>>
>>>>>>> Is there any mistake in the way I am defining the keys or querying
>>>>>>> them?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Mehak
>>>>>>>
>>>>>>> On Wed, Mar 18, 2015 at 3:02 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Try setting the fetch size before querying. Assuming you don't set
>>>>>>>> it too high, and you don't have too many tombstones, that should do
>>>>>>>> it.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Jens
>>>>>>>>
>>>>>>>> –
>>>>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>>>>>>>
>>>>>>>> On Wed, Mar 18, 2015 at 2:58 AM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have a requirement to fetch a million rows as the result of my
>>>>>>>>> query, which is giving timeout errors. I am fetching results by
>>>>>>>>> selecting on clustering columns, so why are the queries taking so
>>>>>>>>> long? I can change the timeout settings, but I need the data to be
>>>>>>>>> fetched faster to meet my requirement.
>>>>>>>>>
>>>>>>>>> My table definition is:
>>>>>>>>>
>>>>>>>>> CREATE TABLE images.results (
>>>>>>>>>     uuid uuid,
>>>>>>>>>     analysis_execution_id varchar,
>>>>>>>>>     analysis_execution_uuid uuid,
>>>>>>>>>     x double,
>>>>>>>>>     y double,
>>>>>>>>>     loc varchar,
>>>>>>>>>     w double,
>>>>>>>>>     h double,
>>>>>>>>>     normalized varchar,
>>>>>>>>>     type varchar,
>>>>>>>>>     filehost varchar,
>>>>>>>>>     filename varchar,
>>>>>>>>>     image_uuid uuid,
>>>>>>>>>     image_uri varchar,
>>>>>>>>>     image_caseid varchar,
>>>>>>>>>     image_mpp_x double,
>>>>>>>>>     image_mpp_y double,
>>>>>>>>>     image_width double,
>>>>>>>>>     image_height double,
>>>>>>>>>     objective double,
>>>>>>>>>     cancer_type varchar,
>>>>>>>>>     Area float,
>>>>>>>>>     submit_date timestamp,
>>>>>>>>>     points list<double>,
>>>>>>>>>     PRIMARY KEY ((image_caseid), Area, uuid)
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> Here each row is uniquely identified by its uuid, but since my data
>>>>>>>>> is generally queried by image_caseid, I have made that the partition
>>>>>>>>> key. I am currently using the Java Datastax API to fetch the results,
>>>>>>>>> but the query is taking a lot of time, resulting in timeout errors:
>>>>>>>>>
>>>>>>>>> Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException:
>>>>>>>>> All host(s) tried for query failed (tried: localhost/127.0.0.1:9042
>>>>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response))
>>>>>>>>>     at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>>>>>>>>>     at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:289)
>>>>>>>>>     at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:205)
>>>>>>>>>     at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>>>>>>>>>     at QueryDB.queryArea(TestQuery.java:59)
>>>>>>>>>     at TestQuery.main(TestQuery.java:35)
>>>>>>>>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException:
>>>>>>>>> All host(s) tried for query failed (tried: localhost/127.0.0.1:9042
>>>>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response))
>>>>>>>>>     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
>>>>>>>>>     at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>
>>>>>>>>> Also, when I try the same query on the console, even with a limit of
>>>>>>>>> 2000 rows:
>>>>>>>>>
>>>>>>>>> cqlsh:images> select count(*) from results where
>>>>>>>>>     image_caseid='TCGA-HN-A2NL-01Z-00-DX1' and Area<100 and Area>20 limit 2000;
>>>>>>>>> errors={}, last_host=127.0.0.1
>>>>>>>>>
>>>>>>>>> Thanks and Regards,
>>>>>>>>> Mehak
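
For reference, here is a minimal sketch of the fetch-size suggestion from the thread above, using the Datastax Java driver's automatic paging against the images.results table. This is only an illustration: the contact point, case id, selected columns, and page size are assumptions, not code from the thread.

import com.datastax.driver.core.*;

public class FetchByArea {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("images");

        // Page size of 2000, as suggested in the thread; the driver then pulls
        // the partition in pages instead of one huge response that times out.
        Statement stmt = new SimpleStatement(
                "SELECT uuid, x, y, w, h FROM results " +
                "WHERE image_caseid = 'TCGA-HN-A2NL-01Z-00-DX1' " +
                "AND Area > 20 AND Area < 100")
            .setFetchSize(2000);

        long count = 0;
        for (Row row : session.execute(stmt)) {
            // Iterating transparently fetches the next page when the current
            // one is exhausted, so the full million rows never sit in one reply.
            count++;
        }
        System.out.println("rows fetched: " + count);

        cluster.close();
    }
}

With automatic paging, each page is its own request to the coordinator, so the read timeout should apply per 2000-row page rather than to the entire million-row result.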
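
On the 2D range question at the top, one possible way to stay on plain Cassandra (without Solr) is to denormalize into a tile-bucketed table and turn the 2D window into a handful of single-partition range queries. The sketch below is only an illustration of that idea: the results_by_tile table, its bucket width, and the window values are hypothetical, not part of the existing schema, and the application would have to populate this table alongside images.results.

import com.datastax.driver.core.*;

public class TwoDRangeSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("images");

        // Hypothetical companion table: x is bucketed into fixed-width tiles so
        // each (case, tile) pair is one partition, and y is a clustering column
        // that Cassandra can range-scan server-side.
        session.execute(
            "CREATE TABLE IF NOT EXISTS results_by_tile (" +
            "  image_caseid varchar, x_bucket int, y double, x double, uuid uuid," +
            "  PRIMARY KEY ((image_caseid, x_bucket), y, uuid))");

        // A 2D window (xMin..xMax, yMin..yMax) becomes one query per x bucket,
        // with the y range pushed down; x only needs client-side filtering in
        // the two boundary buckets.
        String caseId = "TCGA-HN-A2NL-01Z-00-DX1";
        double xMin = 15000, xMax = 23000, yMin = 4000, yMax = 9000;
        int bucketWidth = 1000; // assumed tile width in image coordinates

        for (int b = (int) (xMin / bucketWidth); b <= (int) (xMax / bucketWidth); b++) {
            ResultSet rs = session.execute(
                "SELECT uuid, x, y FROM results_by_tile " +
                "WHERE image_caseid = ? AND x_bucket = ? AND y >= ? AND y <= ?",
                caseId, b, yMin, yMax);
            for (Row row : rs) {
                if (row.getDouble("x") >= xMin && row.getDouble("x") <= xMax) {
                    // row is inside the 2D window; hand it to the renderer
                }
            }
        }

        cluster.close();
    }
}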