cess/content/understanding-administering-compactions.html
>
> Also, if everything else fails, you can still issue the ALTER TABLE command
> periodically using crontab (see the sketch after this message). Running an
> extra compaction will not hurt that much.
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 14:25, David Morin wrote:
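For reference, a minimal sketch of the periodic major compaction mentioned
above; database, table, and partition names are hypothetical:
ALTER TABLE mydb.mytable COMPACT 'major';
-- for a partitioned table, compact one partition at a time:
ALTER TABLE mydb.mytable PARTITION (ds='2020-06-01') COMPACT 'major';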
On Jun 2, 2020 at 12:57, Peter Vary wrote:
> Hi David,
>
> You do not really need to run compaction every time.
> Is it possible to wait for the compaction to start automatically next time?
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 12:51, David Morin wrote:
>
>
This looks very confusing when looking at
> the logs."
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 11:44, David Morin wrote:
>
> I don't get it.
> The transaction id in the error message "No delta files or original files
> found to compact in hdfs://... with mi
compaction for the current database/table
On 2020/06/01 20:13:08, David Morin wrote:
> Hi,
>
> I have a compaction issue on my cluster. When I force a compaction (major) on
> one table I get this error in Metastore logs:
>
> 2020-06-01 19:49:35,512 ERROR [-78]: compactor.CompactorMR
Hi,
I have a compaction issue on my cluster. When I force a compaction (major) on
one table I get this error in Metastore logs:
2020-06-01 19:49:35,512 ERROR [-78]: compactor.CompactorMR
(CompactorMR.java:run(264)) - No delta files or original files found to compact
in
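When a forced compaction fails like this, the compactor queue state can be
inspected with the standard statement below; a minimal sketch:
SHOW COMPACTIONS;
-- lists database, table, partition, type (MAJOR/MINOR), and state
-- (e.g. initiated, working, ready for cleaning, failed) for each request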
On Thu, Feb 6, 2020 at 12:12, David Morin wrote:
> ok, Peter
> No problem. Thx
> I'll keep you posted
>
> On 2020/02/06 09:42:39, Peter Vary wrote:
> > Hi David,
> >
> > I am more familiar with ACID v2 :(
> > What I would do is to run an update operat
It would be nice to hear back from you if you found something.
>
> Thanks,
> Peter
>
> > On Feb 5, 2020, at 16:55, David Morin wrote:
> >
> > Hello,
> >
> > Thanks.
> > In fact I use HDP 2.6.5 and previous Orc version with transactionid for
> > exampl
ws. Only insert and delete. So update
> is handled as delete (old) row, insert (new/independent) row.
> The delete is stored in the delete delta directories, and the files do not
> have to contain the {row} struct at the end.
>
> Hope this helps,
> Peter
>
> > On Feb 5
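A sketch of the ACID v2 layout Peter describes above; directory and file
names are purely illustrative:
warehouse/mytable/
  delta_0000012_0000012_0000/bucket_00000         (insert events, full {row} struct)
  delete_delta_0000013_0000013_0000/bucket_00000  (delete events, {row} is null)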
73_0199073_
hdfs:///delta_0199073_0199073_0002
And the first one contains updates (operation:1) and the second one inserts
(operation:0).
Thanks for your help
David
On 2019/12/01 16:57:08, David Morin wrote:
> Hi Peter,
>
> At the moment I have a pipeline based on Flink to wri
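To see the operation field of rows in delta files like the ones above, the
hive --orcfiledump tool (also used further down on this page) can dump rows
as JSON; a sketch with a hypothetical path:
hive --orcfiledump -d hdfs:///warehouse/mytable/delta_0199073_0199073_0002/bucket_00000
Each row prints in the form
{"operation":0,"originalTransaction":...,"bucket":...,"rowId":...,"currentTransaction":...,"row":{...}}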
Hi,
When major compactions have been performed on Hive tables based on the Orc
format, are the Orc stripes rewritten? I know that records are not updated
(some of them are dropped, but none are modified), but concerning stripe
sizes, do major compactions impact them?
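One way to check this empirically: running orcfiledump without the data flag
prints stripe boundaries and statistics, so a base file produced by a major
compaction can be compared with the original delta files (path hypothetical):
hive --orcfiledump hdfs:///warehouse/mytable/base_0199073/bucket_00000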
For your question below: Yes, the files should be ordered by the
> (originalTransaction, bucket, rowId) triple, otherwise you will get wrong
> results.
>
> Thanks,
> Peter
>
> > On Nov 19, 2019, at 13:30, David Morin wrote:
> >
> > here is more detail:
tid":3,"rowid":0} | *5218* |
| {"transactionid":11365,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":
Hello,
I'm trying to understand the purpose of the rowid column inside ORC delta
files:
{"transactionid":11359,"bucketid":5,"*rowid*":0}
Orc view: {"operation":0,"originalTransaction":11359,"bucket":5,"*rowId*
":0,"currentTransaction":11359,"row":...}
I use HDP 2.6 => Hive 2
If I want to be
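The pairs shown above look like the output of selecting the ROW__ID virtual
column that Hive exposes on ACID tables; a minimal sketch (table and column
names hypothetical):
SELECT ROW__ID, my_key FROM my_acid_table;
-- ROW__ID prints as {"transactionid":...,"bucketid":...,"rowid":...};
-- rowid is a sequence number assigned per (transaction, bucket), so the
-- triple uniquely identifies a row version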
>
> Alan.
>
> On Mon, Sep 9, 2019 at 10:55 AM David Morin
> wrote:
>
>> Thanks Alan,
>>
>> When you say "you just can't have two simultaneous deletes in the same
>> partition", does "simultaneous" mean within the same transaction?
>> If I create 2 "t
Hive 3, where update and delete also take shared locks and
> a first committer wins strategy is employed instead.
>
> Alan.
>
> On Mon, Sep 9, 2019 at 8:29 AM David Morin
> wrote:
>
>> Hello,
>>
>> I use in production HDP 2.6.5 with Hive 2.1.0
>> We use t
Hello,
I use in production HDP 2.6.5 with Hive 2.1.0
We use transactional tables and we try to ingest data in a streaming way
(despite the fact that we still use Hive 2).
I've read some docs, but I would like some clarification concerning the use of
locks with transactional tables.
Do we have to use
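For questions like this, the current lock state of a transactional table can
be inspected directly; a minimal sketch (table name hypothetical):
SHOW LOCKS my_acid_table;
-- with the DbTxnManager this lists lock id, state (acquired/waiting), and
-- type (shared_read/shared_write/exclusive) for each lock on the table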
, isn't it?
So this is a workaround, but a bit of a crappy one.
I'm open to any more suitable solution.
On Mon, Aug 26, 2019 at 08:51, David Morin wrote:
> Sorry, the same link in English:
> http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
>
> On Mon, Aug 26, 2019 at
Sorry, the same link in English:
http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
On Mon, Aug 26, 2019 at 08:35, David Morin wrote:
> Here is a link related to Hive 3:
> http://www.adaltas.com/fr/2019/07/25/hive-3-fonctionnalites-conseils-astuces/
> The author
Aug 2019 at 07:51, David Morin wrote:
> Hello,
> I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
> environment (HDP 2.6.5, to be precise), but it always fails. It seems that the
> merged base file is created but the deltas are not deleted.
> I
Hello,
I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
environment (HDP 2.6.5, to be precise), but it always fails. It seems that the
merged base file is created but the deltas are not deleted.
I found that it was because the HiveMetastore Client can't connect to the
metastore
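Independently of the metastore connection problem, compactions only run when
the standard compactor settings are enabled on the metastore side; the
effective values can be checked from a session (they must be configured in
the metastore's hive-site.xml to take effect):
SET hive.compactor.initiator.on;
-- should be true on the instance that schedules compactions
SET hive.compactor.worker.threads;
-- should be greater than 0 so a worker can pick the job up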
Tue, Mar 12, 2019 at 12:24 PM David Morin
> wrote:
>
>> Thanks Alan.
>> Yes, the problem in fact was that this streaming API does not handle
>> update and delete.
>> I've used native Orc files, and the next step I've planned is the
>> use of ACID support
this case, though it only handles insert (not update),
> so if you need updates you'd have to do the merge as you are currently
> doing.
>
> Alan.
>
> On Mon, Mar 11, 2019 at 2:09 PM David Morin
> wrote:
>
>> Hello,
>>
>> I've just implemented a pipeline ba
Hello,
I've just implemented a pipeline based on Apache Flink to synchronize
data between MySQL and Hive (transactional + bucketed) onto an HDP
cluster. Flink jobs run on Yarn.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that
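The external tables over the Flink-written ORC directories might be declared
along these lines; schema and location are hypothetical:
CREATE EXTERNAL TABLE staging_ext (id BIGINT, val STRING, op STRING)
STORED AS ORC
LOCATION 'hdfs:///data/flink/out/mytable';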
Hi,
I've just implemented a pipeline to synchronize data between MySQL and Hive
(transactional + bucketed) onto an HDP cluster.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that contain
these delta Orc files.
Then, MERGE INTO
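The MERGE INTO the message refers to might look like the sketch below,
reusing the hypothetical staging table from the previous sketch; op is an
assumed change-type column:
MERGE INTO my_acid_table t
USING staging_ext s
ON t.id = s.id
WHEN MATCHED AND s.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);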
Hello,
I face an error when I try to read my Orc files from Hive (external
table), or Pig, or with hive --orcfiledump.
These files are generated with Flink using the Orc Java API with vectorized
columns.
If I create these files locally (/tmp/...), push them to hdfs, then I can
read the content