Re: swap data in Kudu table

2018-08-04 Thread Boris
Thanks so much, Tomas - glad you liked it. But as you might have seen in
another thread already, the workaround I've described won't work with Impala
2.12 due to a breaking change.

On Thu, Aug 2, 2018, 07:18 far...@tf-bic.sk  wrote:

> Thanks Boris for a great article!
> Tomas
>
> On 2018/07/25 19:56:10, Boris Tyukin  wrote:
> > Hi guys,
> >
> > thanks again for your help! I just blogged about this:
> >
> > https://boristyukin.com/how-to-hot-swap-apache-kudu-tables-with-apache-impala/
> >
> > BTW I did not have to invalidate or refresh metadata - it just worked
> > with the ALTER TABLE TBLPROPERTIES idea. We have one Kudu master on our
> > dev cluster, so I am not sure if it is because of that, but the
> > Impala/Kudu docs also do not mention anything about a metadata refresh.
> > It looks like Impala keeps a reference to the UUID of the Kudu table,
> > not its actual name.
> >
> > One thing I am still puzzled by is how Impala was able to finish my
> > long-running SELECT statement, which I had kicked off right before the
> > swap. I did not get any error messages, and I could clearly see that the
> > Kudu tables were getting renamed and dropped while the query was still
> > running in a different session; it completed 10 seconds after the swap.
> > This is still a mystery to me. The only explanation I have is that the
> > data was already in the Impala daemons' memory and the query did not
> > need the Kudu tables at that point.
> >
> > Boris
> >
> >
> >
> > On Fri, Feb 23, 2018 at 5:13 PM Boris Tyukin 
> wrote:
> >
> > > you guys are awesome, thanks!
> > >
> > > Todd, I like the ALTER TABLE TBLPROPERTIES idea - I will test it next
> > > week. Views might work as well, but for a number of reasons I want to
> > > keep them as my last resort :)
> > >
> > > On Fri, Feb 23, 2018 at 4:32 PM, Todd Lipcon 
> wrote:
> > >
> > >> A couple other ideas from the Impala side:
> > >>
> > >> - could you use a view and alter the view to point to a different
> > >> table? Then all readers would be pointed at the view, and security
> > >> permissions could be on that view rather than the underlying tables?
> > >>
> > >> - I think if you use an external table in Impala you could use an
> > >> ALTER TABLE TBLPROPERTIES ... statement to change kudu.table_name to
> > >> point to a different table. Then issue a 'refresh' on the impalads so
> > >> that they load the new metadata. Subsequent queries would hit the new
> > >> underlying Kudu table, but permissions and stats would be unchanged.
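> > >>
> > >> A rough sketch of what both ideas might look like in practice (the
> > >> database, table, and view names below are made up for illustration):
> > >>
> > >>   -- idea 1: repoint a view at the freshly loaded staging table
> > >>   ALTER VIEW mydb.events_v AS SELECT * FROM mydb.events_staging;
> > >>
> > >>   -- idea 2: repoint an external Impala table at a different Kudu
> > >>   -- table, then refresh so the impalads pick up the new metadata
> > >>   ALTER TABLE mydb.events_ext
> > >>     SET TBLPROPERTIES ('kudu.table_name' = 'impala::mydb.events_staging');
> > >>   REFRESH mydb.events_ext;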
> > >>
> > >> -Todd
> > >>
> > >> On Fri, Feb 23, 2018 at 1:16 PM, Mike Percy 
> wrote:
> > >>
> > >>> Hi Boris, those are good ideas. Currently Kudu does not have atomic
> > >>> bulk load capabilities or staging abilities. Theoretically renaming a
> > >>> partition atomically shouldn't be that hard to implement, since it's
> > >>> just a master metadata operation which can be done atomically, but
> > >>> it's not yet implemented.
> > >>>
> > >>> There is a JIRA to track a generic bulk load API here:
> > >>> https://issues.apache.org/jira/browse/KUDU-1370
> > >>>
> > >>> Since I couldn't find anything to track the specific features you
> > >>> mentioned, I just filed the following improvement JIRAs so we can
> > >>> track them:
> > >>>
> > >>> - KUDU-2326: Support atomic bulk load operation
> > >>>
> > >>> - KUDU-2327: Support atomic swap of tables or partitions
> > >>>
> > >>>
> > >>> Mike
> > >>>
> > >>> On Thu, Feb 22, 2018 at 6:39 AM, Boris Tyukin  >
> > >>> wrote:
> > >>>
> >  Hello,
> > 
> >  I am trying to figure out the best and safest way to swap data in a
> >  production Kudu table with data from a staging table.
> > 
> >  Basically, once in a while we need to perform a full reload of some
> >  tables (once every few months). These tables are pretty large, with
> >  billions of rows, and we want to minimize the risk and downtime for
> >  users if something bad happens in the middle of that process.
> > 
> >  With Hive and Impala on HDFS, we can use a very handy command, LOAD
> >  DATA INPATH. We can prepare the data for the reload in a staging table
> >  upfront, and this process might take many hours. Once the staging table
> >  is ready, we issue the LOAD DATA INPATH command, which moves the
> >  underlying HDFS files to the production table - this operation is
> >  almost instant and is the very last step in our pipeline.
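> >
> >  A minimal sketch of that last step (the path, table, and partition
> >  names are made up):
> >
> >    LOAD DATA INPATH '/user/etl/staging/events'
> >    OVERWRITE INTO TABLE mydb.events PARTITION (load_date='2018-02-22');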
> > 
> >  Alternatively, we can swap partitions using the ALTER TABLE ...
> >  EXCHANGE PARTITION command.
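> >
> >  In Hive, that might look roughly like this (again with made-up names;
> >  the partition is moved from the staging table into the production
> >  table):
> >
> >    ALTER TABLE mydb.events EXCHANGE PARTITION (load_date='2018-02-22')
> >    WITH TABLE mydb.events_staging;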
> > 
> >  Now with Kudu, I cannot seem to find a good strategy. The only thing
> >  that came to my mind is to drop the production table and rename the
> >  staging table to the production table name as the last step of the
> >  job, but in that case we are going to lose statistics and security
> >  permissions.
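> >
> >  In other words, something like this as the last step of the job (names
> >  made up), which is exactly what would lose the stats and permissions on
> >  the production table:
> >
> >    DROP TABLE mydb.events;
> >    ALTER TABLE mydb.events_staging RENAME TO mydb.events;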
> > 
> >  Any other ideas?
> > 
> >  Thanks!
> >  

Re: Re: Recommended maximum amount of stored data per tablet server

2018-08-04 Thread Boris Tyukin
How much space is typically allocated just for the WAL and metadata? We have
two 400GB SSDs in RAID 5 for the OS and twelve 12TB HDDs. Is it still a good
idea to carve out maybe 100GB on the SSDs, or to use a dedicated HDD?

On Thu, Aug 2, 2018, 20:36 Todd Lipcon  wrote:

> On Thu, Aug 2, 2018 at 4:54 PM, Quanlong Huang 
> wrote:
>
>> Thanks, Adar and Todd! We'd like to contribute when we can.
>>
>> Are there any concerns if we share the machines with HDFS DataNodes and
>> YARN NodeManagers? The network bandwidth is 10Gbps. I think it's OK if
>> they don't share the same disks, e.g. 4 disks for Kudu and the other 11
>> disks for the DataNode and NodeManager, and we leave enough CPU & memory
>> for Kudu. Is that right?
>>
>
> That should be fine. Typically we actually recommend sharing all the disks
> for all of the services. There is a trade-off between static partitioning
> (exclusive access to a smaller number of disks) vs. dynamic sharing
> (potential contention but more available resources). Unless your workload
> is very latency sensitive, I usually think it's better to have the bigger
> pool of resources available, even if it needs to be shared with other
> systems.
>
> One recommendation, though, is to consider using a dedicated disk for the
> Kudu WAL and metadata, which can help performance, since the WAL can be
> sensitive to other heavy workloads monopolizing bandwidth on the same
> spindle.
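>
> As a sketch, that separation could be expressed with tablet server flags
> along these lines (the paths are made up, and --fs_metadata_dir may not
> exist in older Kudu releases):
>
>   --fs_wal_dir=/data/ssd/kudu/wal
>   --fs_metadata_dir=/data/ssd/kudu/meta
>   --fs_data_dirs=/data/1/kudu,/data/2/kudu,/data/3/kudu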
>
> -Todd
>
>>
>> At 2018-08-03 02:26:37, "Todd Lipcon"  wrote:
>>
>> +1 to what Adar said.
>>
>> One tension we currently have with scaling is that we don't want to let
>> individual tablets grow too large, because of problems like the
>> superblock that Adar mentioned. However, the solution of just having more
>> tablets is also not a great one, since many of our startup time problems
>> are primarily affected by the number of tablets more than their size (see
>> KUDU-38 as the prime, ancient, example). Additionally, having lots of
>> tablets increases Raft heartbeat traffic, and we may need to dial back
>> those heartbeat intervals to keep things stable.
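>>
>> (As a made-up example, "dialing back" here would mean raising the tablet
>> server flag --raft_heartbeat_interval_ms above its default, trading less
>> heartbeat traffic for slower failure detection.)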
>>
>> All of these things can be addressed in time and with some work. If you
>> are interested in working on these areas to improve density, that would
>> be a great contribution.
>>
>> -Todd
>>
>>
>>
>> On Thu, Aug 2, 2018 at 11:17 AM, Adar Lieber-Dembo 
>> wrote:
>>
>>> The 8TB limit isn't a hard one; it's just a reflection of the scale
>>> that Kudu developers commonly test. Beyond 8TB we can't vouch for
>>> Kudu's stability and performance. For example, we know that as the
>>> amount of on-disk data grows, node restart times get longer and longer
>>> (see KUDU-2014 for some ideas on how to improve that). Furthermore, as
>>> tablets accrue more data blocks, their superblocks become larger,
>>> raising the minimum amount of I/O for any operation that rewrites a
>>> superblock (such as a flush or compaction). Lastly, the tablet copy
>>> protocol used in re-replication tries to copy the entire superblock in
>>> one RPC message; if the superblock is too large, it'll run up against
>>> the default 50 MB RPC transfer size (see src/kudu/rpc/transfer.cc).
>>>
>>> These examples are just off the top of my head; there may be others
>>> lurking. So this goes back to what I led with: beyond the recommended
>>> limit we aren't quite sure how Kudu's performance and stability are
>>> affected.
>>>
>>> All that said, you're welcome to try it out and report back with your
>>> findings.
>>>
>>>
>>> On Thu, Aug 2, 2018 at 7:23 AM Quanlong Huang 
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > In the "Known Issues and Limitations" document, it's recommended that
>>> > the "maximum amount of stored data, post-replication and
>>> > post-compression, per tablet server is 8TB". How is the 8TB calculated?
>>> >
>>> > We have some machines, each with 15 * 4TB spinning disk drives, 256GB
>>> > RAM, and 48 CPU cores. Does it mean the other 52 TB (= 15 * 4 - 8) of
>>> > space should be left for other systems? We prefer to make the machines
>>> > dedicated to Kudu. Can the tablet server leverage the whole space
>>> > efficiently?
>>> >
>>> > Thanks,
>>> > Quanlong
>>>
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Dictionary encoding

2018-08-04 Thread Saeid Sattari
Hi Kudu community,

Does anybody know the maximum number of distinct values in a string column
that Kudu considers in order to set its encoding to dictionary? Many thanks
:)
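
For context, here is a minimal sketch (Impala DDL for a Kudu table, with
made-up names) of pinning a column's encoding explicitly instead of relying
on the default:

  CREATE TABLE dict_example (
    id BIGINT,
    label STRING ENCODING DICT_ENCODING,
    PRIMARY KEY (id)
  )
  PARTITION BY HASH (id) PARTITIONS 4
  STORED AS KUDU;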

br,