This error is usually caused by there being no records in the selected time range. You can verify this by checking the “data size” of the first job step; if it is very small, that is the case. Because the flat Hive table contains no data, the second step outputs no distinct values, and the third step then reports the file-not-found error. Please check your Hive table and the selected date range.
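One way to check the Hive side before submitting a build is to count the rows in the range. Below is a minimal sketch in Python that only assembles the HiveQL statement. It assumes the date column is stored as a 'yyyy-MM-dd' string; the table name `fact_table`, the column name `dt`, and the helper `count_query` are hypothetical placeholders, not anything from your cube definition. The range is treated as start-inclusive and end-exclusive.

```python
from datetime import date


def count_query(table: str, date_column: str, start: date, end: date) -> str:
    """Build a HiveQL row count for the half-open range [start, end).

    Assumes date_column holds 'yyyy-MM-dd' strings (an assumption;
    adjust the comparison if the column uses another format).
    """
    return (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {date_column} >= '{start.isoformat()}' "
        f"AND {date_column} < '{end.isoformat()}'"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(count_query("fact_table", "dt", date(2015, 5, 23), date(2015, 6, 23)))
```

Run the generated statement in Hive; if the count is 0, the selected range has no records and the build would be expected to fail in the way described above.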

On 6/30/15, 8:30 PM, "Vineet Mishra" <[email protected]> wrote:

>Hi,
>
>Runnning/Scheduling multiple jobs at once is killing all the other jobs
>except only one.
>
>So I have three cubes and to build cube I have corresponding build jobs,
>Its failing at third step Build Dimension Dictionary with FileNotFound
>exception with
>
>java.io.FileNotFoundException: File does not exist:
>/tmp/kylin-65abae6a-72e0-4b59-880b-ece8ab49b33b/sc_sd_esd_diff1/fact_distinct_columns/cn
>
>java.io.FileNotFoundException: File does not exist:
>/tmp/kylin-b9145673-de15-4304-8c99-431618219c28/sc_o2s_metrics_verified/fact_distinct_columns/sc
>
>Any suggestions would be highly appreciated!
>
>Thanks,
>
>On Wed, Jun 24, 2015 at 9:32 PM, Vineet Mishra <[email protected]> wrote:
>
>> Thanks Shi!
>>
>> On Wed, Jun 24, 2015 at 12:59 PM, Shi, Shaofeng <[email protected]> wrote:
>>
>>> Yes purge can also be requested via REST API, see the API list:
>>>
>>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Kylin%20Restful%20API%20List.md
>>>
>>> On 6/24/15, 3:07 PM, "Vineet Mishra" <[email protected]> wrote:
>>>
>>> >Hi Shi,
>>> >
>>> >For my use case, its like the data can change throughout from the very
>>> >initial for every next day as the hive table is truncate load from scratch
>>> >and the process is meant to be like that. I guess in that case purging the
>>> >cube and rebuilding from the scratch would be better and only option.
>>> >
>>> >Referring to the mentioned url for the kylin api
>>> >
>>> >https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
>>> >
>>> >is it even possible to purge the cube and rebuild through the api, as I can
>>> >only see build and merge option mentioned in the api.
>>> >
>>> >Thanks!
>>> >
>>> >On Wed, Jun 24, 2015 at 8:13 AM, Shi, Shaofeng <[email protected]> wrote:
>>> >
>>> >> Hi Vineet,
>>> >>
>>> >> It can vary depends on your scenario:
>>> >>
>>> >> Say you have build the data from 23 May (inclusive) to 23 June
>>> >> (exclusive); and Now the data of 23 June loads into hive; If the historic
>>> >> data (23 May to 23 June) in hive will not change, you don’t need to build
>>> >> that again; You just need build a new date range from 23 to 24 June; After
>>> >> the build, there will be two cube “segments”: one is for the past month,
>>> >> and the second is for the 23 June to 24; we call this as “incremental
>>> >> build”; Kylin will scan all cube segments (each segment is a hbase table)
>>> >> when executing a SQL query, so with a big full build or multiple
>>> >> incremental builds you will get the same query result; We suggest use
>>> >> incremental build as that will save resource/time on the cube build;
>>> >>
>>> >> But if your data in hive will change and you expects cube data be sync
>>> >> with hive, you need refresh the historic cube segment, or rebuilt the
>>> >> whole data range each time;
>>> >>
>>> >> On 6/23/15, 5:40 PM, "Vineet Mishra" <[email protected]> wrote:
>>> >>
>>> >> >Hi Shi,
>>> >> >
>>> >> >Referring to the above link, I want to refresh the cube for each day's
>>> >> >corresponding last month's data.
>>> >> >
>>> >> >So my requirement is something like today I wan't to build the cube for
>>> >> >last one month data that is from 23 May to 23 June and tomorrow I will be
>>> >> >requiring the cube for the date range of 24 May to 24 June and so on.
>>> >> >
>>> >> >Can you shadow me as in that case how to use the mentioned API and move
>>> >> >forward? Will it still be considering Refresh build or full cube build.
>>> >> >
>>> >> >Thanks!
>>> >> >
>>> >> >On Tue, Jun 23, 2015 at 1:30 AM, Vineet Mishra <[email protected]> wrote:
>>> >> >
>>> >> >> Hi Shi,
>>> >> >>
>>> >> >> I am not aware off the basic authentication mechanism mentioned here,
>>> >> >> could you help me out as how could I schedule my cube refresh using the
>>> >> >> API.
>>> >> >>
>>> >> >> I tried java URL Connection for basic authentication but couldn't get any
>>> >> >> cookies to move further.
>>> >> >>
>>> >> >> Thanks,
>>> >> >>
>>> >> >> On Wed, Jun 17, 2015 at 7:21 PM, Shi, Shaofeng <[email protected]> wrote:
>>> >> >>
>>> >> >>> You don’t need repeatedly create the cube; We call it “BUILD” or
>>> >> >>> “REFRESH”: “BUILD” is to build a new segment; “REFRESH” is to update an
>>> >> >>> existing cube segment, please check:
>>> >> >>>
>>> >> >>> https://github.com/apache/incubator-kylin/blob/master/docs/REST/Build%20Cube%20with%20Restful%20API.md
>>> >> >>>
>>> >> >>> On 6/11/15, 6:38 PM, "Vineet Mishra" <[email protected]> wrote:
>>> >> >>>
>>> >> >>> >Hi,
>>> >> >>> >
>>> >> >>> >Is there a way so that I can schedule the cube creation by purging the
>>> >> >>> >older cube and creating the new cube everyday taking the same window
>>> >> >>> >interval say around a month or so.
>>> >> >>> >
>>> >> >>> >So my requirement is pretty straightforward, I want to build a cube
>>> >> >>> >considering for the last one month data set and which should refresh every
>>> >> >>> >day with last one month data from the particular date.
>>> >> >>> >
>>> >> >>> >Thanks!
