Thanks Shi! The table was accidentally empty; that was the reason for the error.

On Wed, Jul 1, 2015 at 8:09 AM, Shi, Shaofeng <[email protected]> wrote:

> This error is usually caused by there being no records in the selected time
> range. You can verify this by checking the “data size” of the first job
> step; if it is very small, that is the case. Since there is no data in the
> flat Hive table, the second step outputs no distinct values, and the third
> step then reports a file-not-found error. Please check your Hive table and
> the selected date range.
>
> On 6/30/15, 8:30 PM, "Vineet Mishra" <[email protected]> wrote:
>
>> Hi,
>>
>> Running/scheduling multiple jobs at once is killing all the other jobs
>> except one.
>>
>> I have three cubes, and each cube has a corresponding build job.
>> It's failing at the third step, Build Dimension Dictionary, with a
>> FileNotFoundException:
>>
>> java.io.FileNotFoundException: File does not exist:
>> /tmp/kylin-65abae6a-72e0-4b59-880b-ece8ab49b33b/sc_sd_esd_diff1/fact_distinct_columns/cn
>>
>> java.io.FileNotFoundException: File does not exist:
>> /tmp/kylin-b9145673-de15-4304-8c99-431618219c28/sc_o2s_metrics_verified/fact_distinct_columns/sc
>>
>> Any suggestions would be highly appreciated!
>>
>> Thanks,
>>
>> On Wed, Jun 24, 2015 at 9:32 PM, Vineet Mishra <[email protected]> wrote:
>>
>>> Thanks Shi!
>>>
>>> On Wed, Jun 24, 2015 at 12:59 PM, Shi, Shaofeng <[email protected]> wrote:
>>>
>>>> Yes, purge can also be requested via the REST API; see the API list:
>>>>
>>>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Kylin%20Restful%20API%20List.md
>>>>
>>>> On 6/24/15, 3:07 PM, "Vineet Mishra" <[email protected]> wrote:
>>>>
>>>>> Hi Shi,
>>>>>
>>>>> In my use case, the data can change every day all the way back to the
>>>>> very beginning, because the Hive table is truncated and reloaded from
>>>>> scratch, and the process is meant to work that way. I guess in that
>>>>> case purging the cube and rebuilding it from scratch would be the
>>>>> better, and only, option.
>>>>>
>>>>> Referring to the mentioned URL for the Kylin API
>>>>>
>>>>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
>>>>>
>>>>> is it even possible to purge the cube and rebuild it through the API?
>>>>> I can only see the build and merge options mentioned there.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Wed, Jun 24, 2015 at 8:13 AM, Shi, Shaofeng <[email protected]> wrote:
>>>>>
>>>>>> Hi Vineet,
>>>>>>
>>>>>> It can vary depending on your scenario.
>>>>>>
>>>>>> Say you have built the data from 23 May (inclusive) to 23 June
>>>>>> (exclusive), and now the data for 23 June is loaded into Hive. If the
>>>>>> historic data (23 May to 23 June) in Hive will not change, you don't
>>>>>> need to build that again; you just need to build a new date range
>>>>>> from 23 to 24 June. After the build, there will be two cube
>>>>>> “segments”: one for the past month, and a second for 23 June to
>>>>>> 24 June. We call this an “incremental build”. Kylin scans all cube
>>>>>> segments (each segment is an HBase table) when executing a SQL query,
>>>>>> so with one big full build or multiple incremental builds you will
>>>>>> get the same query result. We suggest using incremental builds, as
>>>>>> that saves resources and time on the cube build.
>>>>>>
>>>>>> But if your data in Hive will change and you expect the cube data to
>>>>>> be in sync with Hive, you need to refresh the historic cube segment,
>>>>>> or rebuild the whole data range each time.
>>>>>>
>>>>>> On 6/23/15, 5:40 PM, "Vineet Mishra" <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Shi,
>>>>>>>
>>>>>>> Referring to the above link, I want to refresh the cube each day
>>>>>>> with the corresponding last month's data.
>>>>>>>
>>>>>>> So my requirement is something like this: today I want to build the
>>>>>>> cube for the last one month of data, that is from 23 May to 23 June,
>>>>>>> and tomorrow I will require the cube for the date range 24 May to
>>>>>>> 24 June, and so on.
>>>>>>>
>>>>>>> Can you show me how to use the mentioned API in that case and move
>>>>>>> forward? Will it still be considered a refresh build, or a full
>>>>>>> cube build?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Tue, Jun 23, 2015 at 1:30 AM, Vineet Mishra <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Shi,
>>>>>>>>
>>>>>>>> I am not aware of the basic authentication mechanism mentioned
>>>>>>>> here; could you help me out with how I could schedule my cube
>>>>>>>> refresh using the API?
>>>>>>>>
>>>>>>>> I tried Java URL Connection for basic authentication but couldn't
>>>>>>>> get any cookies to move further.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> On Wed, Jun 17, 2015 at 7:21 PM, Shi, Shaofeng <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> You don't need to repeatedly create the cube. We call it “BUILD”
>>>>>>>>> or “REFRESH”: “BUILD” builds a new segment; “REFRESH” updates an
>>>>>>>>> existing cube segment. Please check:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
>>>>>>>>>
>>>>>>>>> On 6/11/15, 6:38 PM, "Vineet Mishra" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Is there a way to schedule the cube creation by purging the
>>>>>>>>>> older cube and creating a new cube every day over the same
>>>>>>>>>> window interval, say around a month or so?
>>>>>>>>>>
>>>>>>>>>> My requirement is pretty straightforward: I want to build a cube
>>>>>>>>>> over the last one month of data, which should refresh every day
>>>>>>>>>> with the last one month of data as of that date.
>>>>>>>>>>
>>>>>>>>>> Thanks!

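For anyone scripting the daily build discussed in this thread, here is a minimal Python sketch of the two pieces that came up: sending HTTP Basic credentials, and requesting a build for a one-day incremental range. The endpoint path (`/kylin/api/cubes/<cube>/rebuild`), the JSON field names (`startTime`, `endTime`, `buildType`), the host, the cube name, and the credentials are all assumptions for illustration; please check them against the REST documents linked in the thread. It also assumes the server accepts the Authorization header on each request (so no cookie handling is needed) and reads the times as UTC epoch milliseconds. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def to_epoch_millis(day: date) -> int:
    """Milliseconds since the Unix epoch for 00:00 UTC on the given day."""
    return (day - EPOCH).days * 86_400_000


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from a user/password pair."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_body(start: date, end: date, build_type: str = "BUILD") -> str:
    """JSON body for a build of [start, end). Field names are assumed."""
    return json.dumps({
        "startTime": to_epoch_millis(start),
        "endTime": to_epoch_millis(end),
        "buildType": build_type,
    })


def build_request(base_url, cube, start, end, user, password):
    """Construct (but do not send) a PUT request for one cube build."""
    headers = basic_auth_header(user, password)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/cubes/{cube}/rebuild",  # assumed path
        data=build_body(start, end).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


if __name__ == "__main__":
    # Incremental build for the example in the thread: 23 June to 24 June 2015.
    last_built_end = date(2015, 6, 23)
    req = build_request(
        "http://localhost:7070",   # hypothetical host
        "my_cube",                 # hypothetical cube name
        last_built_end,
        last_built_end + timedelta(days=1),
        "user", "secret",          # hypothetical credentials
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit the job: urllib.request.urlopen(req)
```

Run once a day from a scheduler, something like this would cover the incremental case Shi describes. For a Hive table that is truncated and reloaded, the same request shape would need to be preceded by a purge call, which is not sketched here.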