Re: Monitoring long / stuck CTAS

2015-05-29 Thread Carol McDonald
What Ted just talked about is also explained in this On Demand Training https://www.mapr.com/services/mapr-academy/mapr-distribution-essentials-training-course-on-demand (which is free). On Fri, May 29, 2015 at 5:29 PM, Ted Dunning wrote: > There are two methods to support HBase table APIs.

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Ted Dunning
There are two methods to support HBase table APIs. The first is to simply run HBase. That is just like, well, running HBase. The more interesting alternative is to use a special client API that talks a special table-oriented wire protocol to the file system, which implements a column-family / col…

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Matt
I have another test case that queries a table using a filter on a date range and customer key, and SUMs 38 columns. The returned record set encompasses all 42 columns in the table - not a good design for parquet files or any RDBMS, but a modeling problem that is not yet fully in my control…
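
A rough sketch of the query shape being described, in Drill SQL; the table, column, and filter names here are invented for illustration, since the actual schema was not posted:

~~~
-- Hypothetical names throughout; the real table has 42 columns, 38 of them summed.
SELECT cust_key, region, site, batch_id,
       SUM(metric_01) AS metric_01,
       SUM(metric_02) AS metric_02
       -- ... and so on for the remaining summed columns
FROM dfs.`/data/wide_table`
WHERE event_date BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
  AND cust_key = 12345
GROUP BY cust_key, region, site, batch_id;
~~~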

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Sudheesh Katkam
See below: > On May 27, 2015, at 12:17 PM, Matt wrote: > > Attempting to create a Parquet-backed table with a CTAS from a 44GB tab > delimited file in HDFS. The process seemed to be running, as CPU and IO were > seen on all 4 nodes in this cluster, and .parquet files being created in the > expected…

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Matt
> 1) it isn't HDFS. Is MapR-FS a replacement or stand-in for HDFS? On 29 May 2015, at 5:55, Ted Dunning wrote: > Apologies for the plug, but using MapR FS would help you a lot here. The > trick is that you can run an NFS server on every node and mount that server > as localhost. > > The benefits…

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Yousef Lasi
Could you expand on the HBase table integration? How does that work? On Fri, May 29, 2015 at 5:55 AM, Ted Dunning wrote: > > 4) you get the use of the HBase API without having to run HBase. Tables > are integrated directly into MapR FS. > > On Thu, May 28, 2015 at 9:37 AM, Matt wrote: …

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Ted Dunning
Apologies for the plug, but using MapR FS would help you a lot here. The trick is that you can run an NFS server on every node and mount that server as localhost. The benefits are: 1) the entire cluster appears as a conventional POSIX-style file system in addition to being available via the HDFS API…
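
For readers unfamiliar with the pattern, a minimal sketch of the localhost NFS mount Ted describes; the export path, cluster name, mount options, and mount point are assumptions, not details from the thread:

~~~
# Sketch only: mount the cluster file system over NFS from the local node,
# then write to it with ordinary POSIX tools.
mount -t nfs -o vers=3,nolock localhost:/mapr /mapr
cp /staging/source.tsv /mapr/my.cluster.com/data/
~~~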

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Bumping memory to: DRILL_MAX_DIRECT_MEMORY="16G" DRILL_HEAP="8G" The 44GB file imported successfully in 25 minutes - acceptable on this hardware. I don't know if the default memory setting was to blame or not. On 28 May 2015, at 14:22, Andries Engelbrecht wrote: That is the Drill direct memory…
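
For reference, a sketch of how those values are set; the file lives in the Drill conf directory (paths assume a default install), and each drillbit must be restarted for the change to take effect:

~~~
# $DRILL_HOME/conf/drill-env.sh (edit on every node)
DRILL_MAX_DIRECT_MEMORY="16G"   # direct memory per node
DRILL_HEAP="8G"                 # JVM heap per node

# then restart each drillbit:
# $DRILL_HOME/bin/drillbit.sh restart
~~~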

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
That is a good point. The difference between the number of source rows and those that made it into the parquet files is about the same count as the other fragments. Indeed, the query profile does show fragment 1_1 as CANCELED while the others all have State FINISHED. Additionally, the other fra…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Mehant Baid
I think the problem might be related to a single laggard; it looks like we are waiting for one minor fragment to complete. Based on the output you provided, it looks like fragment 1_1 hasn't completed. You might want to find out where the fragment was scheduled and what is going on on that node. …

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
That is the Drill direct memory per node. DRILL_HEAP is for the heap size per node. More info here: http://drill.apache.org/docs/configuring-drill-memory/ —Andries On May 28, 2015, at 11:09 AM, Matt wrote: > Referencing http://drill.apache.org/docs/configuring-drill-memory/ > > Is DRILL_MAX_DIRECT_MEMORY…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Referencing http://drill.apache.org/docs/configuring-drill-memory/ Is DRILL_MAX_DIRECT_MEMORY the limit for each node, or for the cluster? The root page on a drillbit at port 8047 lists four nodes, with the 16G Maximum Direct Memory equal to DRILL_MAX_DIRECT_MEMORY, so I am uncertain if that is a node…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Did you check the log files for any errors? No messages related to this query contain errors or warnings, nor is there anything mentioning memory or heap. Querying now to determine what is missing in the parquet destination. drillbit.out on the master shows no error messages, and what looks like…
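
A quick way to double-check is to scan both log files for suggestive keywords; the paths below assume a default install layout, so adjust to your log directory:

~~~
# Assumed default log location; adjust to your installation.
grep -iE "error|warn|memory|heap" $DRILL_HOME/log/drillbit.log
grep -iE "error|outofmemory" $DRILL_HOME/log/drillbit.out
~~~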

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
It should execute multi-threaded; I need to check on text files. Did you check the log files for any errors? On May 28, 2015, at 10:36 AM, Matt wrote: >> The time seems pretty long for that file size. What type of file is it? > > Tab-delimited UTF-8 text. > > I left the query to run overnight…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
CPU and IO went to near zero on the master and all nodes after about 1 hour. I do not know if the bulk of rows were written within that hour or after. Is there any way you can read the table and try to validate if all of the data was written? A simple join will show me where it stopped…
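
One validation approach, sketched with hypothetical paths and a hypothetical key column since the real names were not posted: count both sides, then anti-join on a key to find where the write stopped.

~~~
-- Hypothetical paths; substitute the real source file and CTAS target.
SELECT COUNT(*) FROM dfs.`/data/source.tsv`;
SELECT COUNT(*) FROM dfs.tmp.`ctas_target`;

-- Text sources expose fields via the columns array in Drill;
-- a CAST may be needed to match the target column's type.
SELECT s.columns[0] AS key_col
FROM dfs.`/data/source.tsv` s
LEFT JOIN dfs.tmp.`ctas_target` t ON s.columns[0] = t.key_col
WHERE t.key_col IS NULL
LIMIT 10;
~~~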

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
The time seems pretty long for that file size. What type of file is it? Tab-delimited UTF-8 text. I left the query to run overnight to see if it would complete, but 24 hours for an import like this would indeed be too long. Is the CTAS running single-threaded? In the first hour, with this…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
He mentioned in his original post that he saw CPU and IO on all of the nodes for a while when the query was active, but it suddenly dropped down to low CPU usage and stopped producing files. It seems like we are failing to detect an error and cancel the query. It is possible that the failure happen…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
The time seems pretty long for that file size. What type of file is it? Is the CTAS running single-threaded? —Andries On May 28, 2015, at 9:37 AM, Matt wrote: >> How large is the data set you are working with, and your cluster/nodes? > > Just testing with that single 44GB source file currently…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
How large is the data set you are working with, and your cluster/nodes? Just testing with that single 44GB source file currently, and my test cluster is made from 4 nodes, each with 8 CPU cores, 32GB RAM, and a 6TB Ext4 volume (RAID-10). Drill defaults left as they come in v1.0. I will be adjusting…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
Just check the drillbit.log and drillbit.out files in the log directory. Before adjusting memory, see if that is an issue first. It was for me, but as Jason mentioned, there can be other causes as well. You adjust memory allocation in the drill-env.sh files, and have to restart the drillbits. …

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
That is correct. I guess it could be possible that HDFS might run out of heap, but I'm guessing that is unlikely to be the cause of the failure you are seeing. We should not be taxing ZooKeeper enough to cause any issues there. On Thu, May 28, 2015 at 9:17 AM, Matt wrote: > To make sure I am adjusting…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
I did not note any memory errors or warnings in a quick scan of the logs, but to double-check, is there a specific log I would find such warnings in? > On May 28, 2015, at 12:01 PM, Andries Engelbrecht > wrote: > > I have used a single CTAS to create tables using parquet with 1.5B rows. …

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
To make sure I am adjusting the correct config, these are heap parameters within the Drill config path, not for Hadoop or Zookeeper? > On May 28, 2015, at 12:08 PM, Jason Altekruse > wrote: > > There should be no upper limit on the size of the tables you can create > with Drill. Be advised…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
There should be no upper limit on the size of the tables you can create with Drill. Be advised that Drill does currently operate entirely optimistically with regard to available resources. If a network connection between two drillbits fails during a query, we will not currently re-schedule the work…

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
I have used a single CTAS to create tables using parquet with 1.5B rows. It did consume a lot of heap memory on the Drillbits and I had to increase the heap size. Check your logs to see if you are running out of heap memory. I used a 128MB parquet block size. This was with Drill 0.9, so I'm sure…
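
The block size Andries mentions maps to the Drill session option store.parquet.block-size; a minimal sketch with the 128MB value he used:

~~~
-- Set per session before running the CTAS; 134217728 bytes = 128MB.
ALTER SESSION SET `store.parquet.block-size` = 134217728;
~~~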

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Is 300MM records too much to do in a single CTAS statement? After almost 23 hours I killed the query (^c) and it returned:

~~~
+-----------+-----------------------------+
| Fragment  | Number of records written   |
+-----------+-----------------------------+
| 1_20      | 13568824                    |
…
~~~

Monitoring long / stuck CTAS

2015-05-27 Thread Matt
Attempting to create a Parquet-backed table with a CTAS from a 44GB tab-delimited file in HDFS. The process seemed to be running, as CPU and IO were seen on all 4 nodes in this cluster, and .parquet files were being created in the expected path. However, in the last two hours or so, all nodes sho…