I spent some time over the past couple of years making micro-optimizations
within Avro, Parquet, and ORC.
Curious to know if there's a way for you all to get timings at different
levels of the stack to compare and not just look at the top-line numbers. A
further breakdown could also help identify
Not directly. It relies on the underlying storage layer. For example:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
On Tue, Mar 2, 2021 at 6:34 AM qq <987626...@qq.com> wrote:
> Hello:
>
> Does Hive support data encryption?
>
> Thank
Hello,
My hope has been that Hive 4.x would be built on Java 11. However, I've
hit many stumbling blocks over the past year towards this goal. I've been
able to make some progress, but several things are still stuck. It mostly
stems from the fact that Hive has many big-ticket dependencies like
> you can easily create a new version.
> Is this the idea ?
>
> Br,
> Dennis
> --
> *From:* David
> *Sent:* Saturday, October 31, 2020 14:52:04
> *To:* user@hive.apache.org
> *Subject:* Re: Hive Avro: Directly use of embedded Avro
What would your expectation be? That Hive reads the first file it finds
and uses that schema in the table definition?
What if the table is empty and a user attempts an INSERT? What should be
the behavior?
The real power of Avro is not so much that the schema can exist
(optionally) in the file
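One common pattern, sketched here against a hypothetical table and schema file, is to pin the reader schema in the table definition rather than rely on whatever happens to be embedded in each data file:

```sql
-- Hedged sketch: table name, location, and schema URL are made up.
-- The schema lives beside the table definition, not in the data files.
CREATE EXTERNAL TABLE avro_events
STORED AS AVRO
LOCATION '/data/avro_events'
TBLPROPERTIES ('avro.schema.url' = 'hdfs:///schemas/events.avsc');
```

With the schema pinned this way, an INSERT into an empty table has a well-defined target schema, which answers the empty-table question raised earlier in the thread.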
Hello Stephen,
Thanks for your interest. Can you please elaborate a bit more on your
question?
Thanks.
On Mon, Jul 27, 2020 at 4:11 PM Stephen Boesch wrote:
> Why would it be this way instead of the other way around?
>
> On Mon, 27 Jul 2020 at 12:27, David wrote:
>
>>
Hello Hive Users.
I am interested in gathering some feedback on the adoption of Hive-on-Spark.
Does anyone care to volunteer their usage information and would you be open
to removing it in favor of Hive-on-Tez in subsequent releases of Hive?
If you are on MapReduce still, would you be open to
Yes Peter, we're working on it. We are trying to make compaction run
automatically, falling back to crontab otherwise.
Thanks for your help
David
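For reference, a sketch of the knobs involved in automatic compaction, per the Hive ACID documentation; the table name and threshold values are illustrative, not tuned:

```sql
-- Metastore side: the compaction Initiator and Worker threads must be on
-- (normally set in hive-site.xml on the metastore; shown as SET for brevity).
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=2;

-- Per table: lower the delta thresholds so compaction triggers sooner.
ALTER TABLE my_acid_table SET TBLPROPERTIES (
  'compactorthreshold.hive.compactor.delta.num.threshold' = '8',
  'compactorthreshold.hive.compactor.delta.pct.threshold' = '0.5'
);
```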
On Tue, Jun 2, 2020 at 2:48 PM, Peter Vary wrote:
> Hi David,
>
> Maybe this can help:
> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_data-ac
Jun 2, 2020 at 12:57 PM, Peter Vary wrote:
> Hi David,
>
> You do not really need to run compaction every time.
> Is it possible to wait for the compaction to start automatically next time?
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 12:51, David Morin wrote:
>
>
Thanks Peter,
Any workaround on HDP 2.6.x with Hive 2?
Otherwise, the only way is to reduce the time these "merge" queries take
in order to cancel locks and related transactions. Am I right?
On Tue, Jun 2, 2020 at 11:52 AM, Peter Vary wrote:
> Hi David,
>
> I think this
compaction for the current database/table
On 2020/06/01 20:13:08, David Morin wrote:
> Hi,
>
> I have a compaction issue on my cluster. When I force a compaction (major) on
> one table I get this error in Metastore logs:
>
> 2020-06-01 19:49:35,512 ERROR [-78]: compactor.Compacto
Hi,
I have a compaction issue on my cluster. When I force a compaction (major) on
one table I get this error in Metastore logs:
2020-06-01 19:49:35,512 ERROR [-78]: compactor.CompactorMR
(CompactorMR.java:run(264)) - No delta files or original files found to compact
in
Thanks Ashutosh!
On Mon, Apr 13, 2020 at 12:27 PM Ashutosh Chauhan
wrote:
> Hi David,
> Added you to Hive wiki.
> Thanks,
> Ashutosh
>
> On Mon, Apr 13, 2020 at 6:39 AM David Mollitor wrote:
>
>> Hello Team,
>>
>> Is anyone able to grant me ac
Hello Team,
Is anyone able to grant me access to the Apache Hive Wiki (dmollitor) ?
Also, is there any discussion/interest in moving docs into the git repo?
Thanks!
Hi Anup,
I'm not that familiar yet with Hive's S3/Glacier-related capabilities, but
a quick search in both the code base and our jira project returned
nothing in relation to Glacier.
Regards,
David
On Tue, Mar 3, 2020 at 7:48 AM Anup Tiwari wrote:
> Hi Team,
>
> It will be reall
Hi Peter,
Just to give some news concerning my issue.
The problem is fixed. In fact, it was caused by a reset of the rowid in my
application: the default batch size of my VectorizedRowBatch (ORC) is 1024,
and during the reset of this batch the rowid was reset as well.
It now works as expected.
Thanks
David
https://community.cloudera.com/t5/Support-Questions/Map-and-Reduce-Error-Java-heap-space/td-p/45874
On Fri, Feb 14, 2020, 6:58 PM David Mollitor wrote:
> Hive has many optimizations. One is that it will load the data directly
> from storage (HDFS) if it's a trivial query. For e
Hive has many optimizations. One is that it will load the data directly
from storage (HDFS) if it's a trivial query. For example:
Select * from table limit 10;
In natural language it says "give me any ten rows (if available) from the
table." You don't need the overhead of launching a full
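The optimization described above is the fetch-task conversion; a sketch of the setting that controls it, with values per the Hive configuration docs:

```sql
-- 'none' disables it, 'minimal' covers SELECT * with partition filters
-- and LIMIT; 'more' extends it to general SELECT/FILTER/LIMIT queries.
SET hive.fetch.task.conversion=more;
SELECT * FROM my_table LIMIT 10;  -- served straight from storage, no job
```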
ok, Peter
No problem. Thx
I'll keep you in touch
On 2020/02/06 09:42:39, Peter Vary wrote:
> Hi David,
>
> I'm more familiar with ACID v2 :(
> What I would do is to run an update operation with your version of Hive and
> try to see how it handles this case.
>
> Would be n
s to be good.
Probably a problem in the sort, but I follow the rule that data are ordered by
originalTransaction, bucketId, rowId in ascending order and currentTransaction
in descending order. It works pretty well except for some tables with lots of updates.
The only thing I can see at the moment it is the fact that I
73_0199073_
hdfs:///delta_0199073_0199073_0002
And the first one contains updates (operation:1) and the second one, inserts
(operation:0)
Thanks for your help
David
On 2019/12/01 16:57:08, David Morin wrote:
> Hi Peter,
>
> At the moment I have a pipeline based on Flink to wri
In the beginning, hive was a command line tool. All the heavy lifting
happened on the user's local box. If a user wanted to execute hive from
their laptop, or a server, they always needed access to the list of available
tables (and their schemas and their locations); otherwise every SQL script
Hello,
Streaming? NiFi
Upserts? HBase, Kudu, Hive 3.x
Doing upserts on Hive can be cumbersome, depending on the use case. If
Upserts are being submitted continuously and quickly, it can overwhelm the
system because it will require a scan across the data set (for all intents
and purposes) for
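For reference, a hedged sketch of how an upsert is usually expressed on a Hive 3.x ACID table with MERGE; table and column names are hypothetical:

```sql
MERGE INTO target t
USING updates u
  ON t.id = u.id
WHEN MATCHED THEN UPDATE SET val = u.val
WHEN NOT MATCHED THEN INSERT VALUES (u.id, u.val);
```

Each MERGE still scans the join side, which is exactly the cost continuous upserts run into.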
?
For example, with Orc files composed of small stripes, if I perform a major
compaction, can I expect to get new Orc files with a bigger stripe size?
Thanks in advance
David
ate
to store Hive metadata (originalTransaction, bucket, rowId, ..)
Thanks for your reply, because yes, when files are ordered by
originalTransaction, bucket, rowId
it works! I just have to use 1 transaction instead of 2 at the moment and it
will be ok.
Thanks
David
On 2019/11/29 11:18:05, Peter V
tid":3,"rowid":0} | *5218* |
| {"transactionid":11365,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":
w transaction during the second
INSERT but that seems to generate duplicate records.
Regards,
David
Hello,
Not sure if this answers your question, but please note the following:
Processing occurs via MapReduce, Spark, or Tez. The processing engines run
on top of YARN. Each processing engine derives much of their HA from
YARN. There are some quirks there, but these engines running on YARN is
s data is deleted
and the new data is renamed/moved. Something to watch out for is that if the query
returns no rows, then the old data isn't removed.
Thanks
Shawn
From: David M
Reply-To: "user@hive.apache.org"
Date: Wednesday, November 6, 2019 at 3:27 PM
To: "user@hive.apache.
a definitive answer on this? Pointers to the source
code or documentation that explains this would be even better.
Thanks!
David McGinnis
>
> Alan.
>
> On Mon, Sep 9, 2019 at 10:55 AM David Morin
> wrote:
>
>> Thanks Alan,
>>
>> When you say "you just can't have two simultaneous deletes in the same
>> partition", does simultaneous mean within the same transaction?
>> If I create 2 "t
ive 3, where update and delete also take shared locks and
> a first committer wins strategy is employed instead.
>
> Alan.
>
> On Mon, Sep 9, 2019 at 8:29 AM David Morin
> wrote:
>
>> Hello,
>>
>> I use in production HDP 2.6.5 with Hive 2.1.0
>> We use t
for Insert (except original transaction = current)
4. commit transactions
Can we use a shared lock here? That way select queries can still be used
Thanks
David
, isn't it ?
Thus, this is a workaround, but a bit of a crappy one.
But I'm open to any more suitable solution.
On Mon, Aug 26, 2019 at 8:51 AM, David Morin wrote:
> Sorry, the same link in english:
> http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
>
> On Mon, Aug 26, 2019 at
Sorry, the same link in english:
http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
On Mon, Aug 26, 2019 at 8:35 AM, David Morin wrote:
> Here after a link related to hive3:
> http://www.adaltas.com/fr/2019/07/25/hive-3-fonctionnalites-conseils-astuces/
> The author
Aug 2019 at 7:51 AM, David Morin wrote:
> Hello,
> I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
> environment, but it always fails (HDP 2.6.5 precisely). It seems that the
> merged base file is created but the delta is not deleted.
> I
Hello,
I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
environment, but it always fails (HDP 2.6.5 precisely). It seems that the
merged base file is created but the delta is not deleted.
I found that it was because the HiveMetastore Client can't connect to the
metastore
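The statements involved, for anyone following along (table name hypothetical):

```sql
ALTER TABLE my_table COMPACT 'major';
-- Watch the request move through initiated -> working -> succeeded/failed;
-- a 'failed' state here is consistent with a metastore connection problem.
SHOW COMPACTIONS;
```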
based on the number of
files, but only if the files are located in S3. Can someone confirm this?
If this is the case, is there a JIRA tracking a fix, or documentation on why
this has to be this way?
If not, how can I make sure we use more mappers in cases like above?
Thanks!
David McGinnis
ble insertion you can use a syntax somewhat similar to VALUES
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingvaluesintotablesfromSQL
Kind Regards,
David
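The VALUES-style insertion referenced in that link, sketched against a hypothetical table:

```sql
-- Requires a Hive version with INSERT ... VALUES support (0.14+).
INSERT INTO TABLE students
VALUES ('fred', 35, 1.28), ('barney', 32, 2.32);
```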
On Wed, Mar 27, 2019 at 12:40 AM Mainak Ghosh wrote:
> Hello,
>
> We want to create
Tue, Mar 12, 2019 at 12:24 PM David Morin
> wrote:
>
>> Thanks Alan.
>> Yes, the problem in fact was that this streaming API does not handle
>> update and delete.
>> I've used native Orc files and the next step I've planned to do is the
>> use of ACID support
this case, though it only handles insert (not update),
> so if you need updates you'd have to do the merge as you are currently
> doing.
>
> Alan.
>
> On Mon, Mar 11, 2019 at 2:09 PM David Morin
> wrote:
>
>> Hello,
>>
>> I've just implemented a pipeline ba
that contain
these delta Orc files.
Then, MERGE INTO queries are executed periodically to merge data into the
Hive target table.
It works pretty well but we want to avoid the use of these Merge queries.
How can I update Orc files directly from my Flink job?
Thanks,
David
to get the valid
transaction for each table from Hive Metastore and, then, read all related
files.
Is that correct?
Thanks,
David
On Sun, Mar 10, 2019 at 1:45 AM, Nicolas Paris wrote:
> Thanks Alan for the clarifications.
>
> Hive has made such improvements it has lost its ol
Hello,
I am facing an error when I try to read my Orc files from Hive (external
table), from Pig, or with hive --orcfiledump.
These files are generated with Flink using the Orc Java API with vectorized
columns.
If I create these files locally (/tmp/...), push them to hdfs, then I can
read the content
I realized I mistyped my username. My confluence username is mcginnisda. Please
give me write access to the Hive confluence wiki, or tell me where I need to
request it.
Thanks!
From: David M
Sent: Thursday, February 7, 2019 10:38 AM
To: user@hive.apache.org
Subject: Wiki Write Access
All
All,
I'd like to get wiki write access for the Apache Hive wiki, so I can update
some documentation based on a recent patch. My confluence name is mcginnda.
Thanks!
David McGinnis
Think of it as a bloom filter that's more dynamic. It works well when cardinality is
low, but its cost quickly grows past a bloom filter's as cardinality rises.
This data structure supports existence queries, but your email sounds like
you want counts. If so, it's not really the best fit.
On Dec 8, 2017 5:00 PM,
Our schema is nested with top level having 5 struct types. When we try to
query these structs we get the following back
*ORC does not support type conversion from file type string (1) to reader
type array (1)*
Walking through hive in a debugger I see that schema evolution sees the
correct file
in size.
>
> JDBC on its own should work. Is this an ORC table?
>
> What version of Hive are you using?
Kindly find the answers to these questions in my first email :)
>
> HTH
-David
>
>
>
>
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https
help fix those codepaths as part of
> the joint effort with the ODBC driver teams.
I’ll see what I can do. I can’t restart the server at will though, since other
teams are using it as well.
>
> Cheers,
> Gopal
>
Thank you :)
-David
In my test case below, I’m using `beeline` as the Java application receiving
the JDBC stream. As I understand, this is the reference command line interface
to Hive. Are you saying that the reference command line interface is not
efficiently implemented? :)
-David Nies
> Am 20.06.2016 um 17
ork throughput?
Thank you in advance!
Yours
David Nies
Entwickler Business Intelligence
ADITION technologies AG
Oststraße 55, D-40211 Düsseldorf
Schwarzwaldstraße 78b, D-79117 Freiburg im Breisgau
T +49 211 987400 30
F +49 211 987400 33
E david.n...@adition.com <mailto:david.n...@aditi
Better use HCatalog for this.
David
On Apr 5, 2016 at 10:14 AM, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> So you want to interrogate Hive metastore and get information about
> objects for a given schema/database in Hive.
>
> These info are kept in Hiv
You could always set the table's output format to be the null output format.
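A heavily hedged sketch of that idea; the exact SET FILEFORMAT clauses vary by Hive version, and whether a plain Hadoop NullOutputFormat is accepted depends on the version's output-format wrapping, so treat this as a direction rather than a recipe (table name hypothetical):

```sql
-- Writes through the table would be silently discarded; reads unaffected.
ALTER TABLE readonly_t SET FILEFORMAT
  INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.mapred.lib.NullOutputFormat'
  SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe';
```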
On Mar 8, 2016 11:01 PM, "Jörn Franke" wrote:
> What is the use case? You can try security solutions such as Ranger or
> Sentry.
>
> As already mentioned another alternative could be a view.
>
> > On 08
Thanks, that should help moving forward
On Sep 3, 2015 10:38 AM, "Prasanth Jayachandran" <
pjayachand...@hortonworks.com> wrote:
>
> > On Sep 2, 2015, at 10:57 PM, David Capwell <dcapw...@gmail.com> wrote:
> >
> > So, very quickly looked at the JIRA a
Also, the data put in are primitives, structs (list), and arrays (list); we
don't use any of the boxed writables (like text).
On Sep 2, 2015 12:57 PM, "David Capwell" <dcapw...@gmail.com> wrote:
> We have multiple threads writing, but each thread works on one file, so
> orc
, Sep 2, 2015 at 7:34 PM, David Capwell <dcapw...@gmail.com> wrote:
> Thanks for the jira, will see if that works for us.
>
> On Sep 2, 2015 7:11 PM, "Prasanth Jayachandran"
> <pjayachand...@hortonworks.com> wrote:
>>
>> Memory manager is made thread local
ing for me, so no issue
sharding and not configuring?
Thanks for your time reading this email!
On Wed, Sep 2, 2015 at 8:57 PM, David Capwell <dcapw...@gmail.com> wrote:
> So, very quickly looked at the JIRA and I had the following question;
> if you have a pool per thread rather than global
s.memory). We may be missing a synchronization on the
> MemoryManager somewhere and thus be getting a race condition.
>
> Thanks,
>Owen
>
> On Wed, Sep 2, 2015 at 12:57 PM, David Capwell <dcapw...@gmail.com> wrote:
>
>> We have multiple threads writing, but each thread works o
-10191 and see if that helps?
>
> On Sep 2, 2015, at 8:58 PM, David Capwell <dcapw...@gmail.com> wrote:
>
> I'll try that out and see if it goes away (not seen this in the past 24
> hours, no code change).
>
> Doing this now means that I can't share the memory, so will
is that estimateStripeSize
won't always give the correct value since my thread is the one calling
it...
With everything ThreadLocal, the only writers would be the ones in the
same thread, so should be better.
On Wed, Sep 2, 2015 at 9:47 PM, David Capwell <dcapw...@gmail.com> wrote:
>
We are writing ORC files in our application for hive to consume.
Given enough time, we have noticed that writing causes a NPE when
working with a string column's stats. Not sure what's causing it on
our side yet, since replaying the same data is just fine; it seems more
like this just happens over
You probably forgot to load (use) the module before calling new()
On Aug 6, 2015 8:49 AM, siva kumar siva165...@gmail.com wrote:
Hi David ,
I have tried the link you have posted. But im stuck
with this error message below
Can't locate object method "new" via package
that the data **is** in fact sorted...
If there is something specific you are trying to accomplish by specifying
the sort order of that column, perhaps you can elaborate on that.
Otherwise, leave out the 'sorted by' statement and you should be fine.
*From:* David Capwell [mailto:dcapw
.
*From:* David Capwell [mailto:dcapw...@gmail.com]
*Sent:* Monday, August 03, 2015 11:59 AM
*To:* user@hive.apache.org
*Subject:* RE: External sorted tables
Mostly wanted to tell hive it's sorted so it could use more efficient
joins like a map side join. No other reason
On Aug 3, 2015 10:47 AM
to insert data correctly by specifying the number of reducers to
be equal to the number of buckets, and using CLUSTER BY and SORT BY
commands in their query.
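A sketch of that recipe for older Hive versions without enforced bucketing; table names and the bucket count are hypothetical:

```sql
-- The target table is assumed declared CLUSTERED BY (id) INTO 32 BUCKETS.
SET mapred.reduce.tasks=32;            -- one reducer per bucket
INSERT OVERWRITE TABLE bucketed_t
SELECT id, val FROM source_t
CLUSTER BY id;                         -- routes each row to its bucket
```

Newer releases handle this automatically when hive.enforce.bucketing is enabled.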
On Thu, Jul 30, 2015 at 7:22 PM, David Capwell dcapw...@gmail.com wrote:
We are trying to create an external table in hive. This data
Hive is not really meant to serve data as fast as a web page needs. You'll
have to use some intermediate (could even be a db file, or template toolkit
generated static pages).
David
On Jul 28, 2015 8:53 AM, siva kumar siva165...@gmail.com wrote:
Hi Lohith,
We use http
We are trying to create an external table in hive. This data is sorted,
so we wanted to tell hive about this. When I do, it complains about
parsing the CREATE statement.
CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
...
timestamp bigint,
...)
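Two likely parse problems, sketched as a guess: SORTED BY is only accepted together with CLUSTERED BY ... INTO n BUCKETS, and `timestamp` is a keyword that needs backtick quoting. The column list, bucket count, and location below are hypothetical:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
  `timestamp` BIGINT
)
CLUSTERED BY (`timestamp`) SORTED BY (`timestamp` ASC) INTO 16 BUCKETS
LOCATION '/data/testing';
```

As a related reply in this thread notes, SORTED BY is metadata only: Hive trusts, rather than verifies, that the data is actually sorted.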
/lib/Thrift/API/HiveClient2.pm
David
to interact with and query Hive through
the JDBC api from an application.
Thank you,
David McWhorter
—
David McWhorter
Senior Developer, Foundations
Informatics and Technology Services
Office: 434.260.5232 | Mobile: 434.227.2551
david_mcwhor...@premierinc.commailto:david_mcwhor...@premierinc.com
I've had some trouble enabling transactions in Hive 1.0.0 and I've made a post at
http://stackoverflow.com/questions/28867368/hive-transactions-are-crashing
Could anyone check it out and give me some pointers on why things are crashing?
Tyvm, Dave
David Novogrodsky
david.novogrod...@gmail.com
http://www.linkedin.com/in/davidnovogrodsky
STRING,
timeOfCall STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\n",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
)
LOCATION '/user/cloudera/vector/callRecords';
David
We are creating a Hive schema for reading massive JSON files. Our JSON schema
is rather large, and we have found that the default metastore schema for Hive
cannot work for us as-is.
To be specific, one field in our schema has about 17KB of nested structs within
it. Unfortunately, it appears
Hi Brad,
Your test, after editing for local host/file names, etc., worked. It
must be something else I'm doing wrong in my development stuff. At
least I know it should work. I'll figure it out eventually. Thanks
again.
David
On Mon, Apr 28, 2014 at 10:22:57AM -0700, Brad Ruderman wrote:
Hi
expects jar
file.jar to get passed to it. That's how it appears to work when
add jar file.jar is run from a stand-alone Hive CLI and from beeline.
David
On Sat, Apr 26, 2014 at 12:14:53AM -0700, Brad Ruderman wrote:
An easy solution would be to add the jar to the classpath or auxlibs
therefore
Hi all,
We have a few Hive UDFs where I work. These are deployed by a bootstrap
script so that the JAR files are in Hive's CLASSPATH before the server
starts.
This works to load the UDF whenever a cluster is started and then the UDF
can be loaded with the ADD JAR and CREATE TEMPORARY FUNCTION
and Beeline. It seems the add part of any add
file|jar|archive ... command needs to get stripped off somewhere
before it gets passed to AddResourceProcessor.run(). Unfortunately, I
can't find that location when the command is received from pyhs2. Can
someone help?
David
--
David Engel
da
Is it now the minimal required version ?
If not, will there be a Hive 0.13.1 for older hadoop?
Regards,
David
On Wed, Apr 23, 2014 at 4:00 PM, Dmitry Vasilenko dvasi...@gmail.comwrote:
Hive 0.12.0 (and previous versions) worked with Hadoop 0.20.x, 0.23.x.y,
1.x.y, 2.x.y.
Hive 0.13.0 did
,
Petter
2014-04-04 6:02 GMT+02:00 David Quigley dquigle...@gmail.com:
Thanks again Petter, the custom input format was exactly what I needed.
Here is example of my code in case anyone is interested
https://github.com/quicklyNotQuigley/nest
Basically gives you SQL access
Hi Narayanan,
We have had some success with a similar use case using a custom input
format / record reader to recursively split arbitrary json into a set of
discreet records at runtime. No schema is needed. Doing something similar
might give you the functionality you are looking for.
but nothing I saw
actually decomposes nested JSON into a set of discrete records. It's super
useful for us.
On Wed, Apr 2, 2014 at 2:15 AM, Petter von Dolwitz (Hem)
petter.von.dolw...@gmail.com wrote:
Hi David,
you can implement a custom InputFormat (extends
Makes perfect sense, thanks Petter!
On Wed, Apr 2, 2014 at 2:15 AM, Petter von Dolwitz (Hem)
petter.von.dolw...@gmail.com wrote:
Hi David,
you can implement a custom InputFormat (extends
org.apache.hadoop.mapred.FileInputFormat) accompanied by a custom
RecordReader (implements
We are currently streaming complex documents to hdfs with the hope of being
able to query. Each single document logically breaks down into a set of
individual records. In order to use Hive, we preprocess each input document
into a set of discrete records, which we save on HDFS and create an
and hiveserver1.
It fails with hiveserver 2.
Regards
David Gayou
On Thu, Feb 13, 2014 at 3:11 AM, Navis류승우 navis@nexr.com wrote:
With HIVE-3746, which will be included in hive-0.13, HiveServer2 takes
less memory than before.
Could you try it with the version in trunk?
2014-02-13 10:49 GMT+09
1. I have no process with hiveserver2 ...
ps -ef | grep -i hive return some pretty long command with a -Xmx8192
and that's the value set in hive-env.sh
2. The select * from table limit 1 or even 100 is working correctly.
David.
On Tue, Feb 18, 2014 at 4:16 PM, Stephen Sprague sprag
Sorry, I reported it badly. It's 8192M
Thanks,
David.
On Feb 18, 2014 18:37, Stephen Sprague sprag...@gmail.com wrote:
oh. i just noticed the -Xmx value you reported.
there's no M or G after that number?? I'd like to see -Xmx8192M or
-Xmx8G. That *is* very important.
thanks,
Stephen
size)
My use case is really to have as many columns as possible.
Thanks a lot for your help
Regards
David
On Fri, Jan 31, 2014 at 1:12 AM, Edward Capriolo edlinuxg...@gmail.comwrote:
Ok here are the problem(s). Thrift has frame size limits, thrift has to
buffer rows into memory.
Hove
by row basis on those datasets, so
basically the more columns we have the better it is.
We are coming from the SQL world, and Hive is the closest to SQL syntax.
We'd like to keep some SQL manipulation on the data.
Thanks for the Help,
Regards,
David Gayou
On Tue, Jan 28, 2014 at 8:35 PM, Stephen
03:19:42.726 |
2013-09-06 21:01:07.743 |
| spreadsheets2.google.com | 7 | 9 | 2013-09-06 03:19:42.726 |
2013-09-06 13:13:19.84 |
+++-+--+--+
David
--
David Engel
da...@istwok.net
On 26 Nov 2013, at 7:02, Sreenath wrote:
Hey David,
Thanks for the swift reply. Each id will have exactly one file. and
regarding the volume on an average each file would be 100MB of
compressed
data with the maximum going upto around 200MB compressed data.
And how will RC files
, and the average?
David
On 22 Nov 2013, at 9:35, Rok Kralj wrote:
If anybody has any clue what is the cause of this, I'd be happy to
hear it.
On Nov 21, 2013 9:59 PM, Rok Kralj rok.kr...@gmail.com wrote:
what does echo $HADOOP_HEAPSIZE return in the environment you're trying
to launch hive from?
David
outer join of table 1 on table 2.
you'd be able to identify quickly what went wrong. Sort the result so
you get unlikely dupes, and all. Just trial and error until you nail it.
David
in Hive (nothing more).
Can anyone confirm that behaviour?
David
)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
This is usually the case when your PK (on which Sqoop will try to do the
split) isn't an integer.
my 2c.
David
table.
David
It's relatively straight forward to call static functions in JDK using
reflect. For example,
select reflect("java.lang.Math", "max", 2, 3) from mytable limit 1;
However, how do I use reflect to call non-static functions (e.g.,
indexOf() method in java.lang.String class)?
None of the following
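reflect() handles static methods; for instance methods on a value, newer Hive releases ship a companion reflect2() UDF where the first argument becomes the receiver. A sketch, hedged on reflect2 being available in your Hive version:

```sql
-- Equivalent to calling 'hive rocks'.indexOf('rocks') in Java.
SELECT reflect2('hive rocks', 'indexOf', 'rocks');
```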
On 7 Mar 2013, at 2:43, Murtaza Doctor wrote:
Folks,
Wanted to get some help or feedback from the community on this one:
Hello,
in that case it is advisable to start a new thread, and not 'reply-to'
when you compose your email :-)
Have a nice day
David
the
mapper, there is simply not enough memory available to it. Since the
compression scheme is BLOCK, I expected it would be possible to instruct
hive to process only a limited number of fragments instead of everything
that's in the file in 1 go.
David