Under sql/hive/src/main/scala/org/apache/spark/sql/hive/execution, I only
see HiveTableScan and HiveNativeCommand.
At the beginning of HiveTableScan:
* The Hive table scan operator. Column and partition pruning are both
handled.
Looks like filter pushdown hasn't been implemented.
As far as I
Hi,
I have only encountered 'code too large' errors when changing grammars. I
am using SBT/IDEA, no Eclipse.
The size of an ANTLR parser/lexer depends on the rules inside the
source grammar and the rules it depends on, so we should take a look at
IdentifiersParser.g/ExpressionParser.g.
Thanks for the pointer. It seems to be a really pathological case, since
the file that's in error is part of the splinter file (the smaller one,
IdentifiersParser). I'll see if I can work around it by splitting it some more.
iulian
On Thu, Jan 28, 2016 at 4:43 PM, Ted Yu
Thanks Ted, I will try this version.
-- Original Message --
From: "Ted Yu";
Date: Thursday, January 28, 2016, 11:35 PM
To: "开心延年";
Cc: "Jörn Franke"; "Julio Antonio Soto de
Vicente"; "Maciej
After this change:
[SPARK-12681] [SQL] split IdentifiersParser.g into two files
the biggest file under
sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser is
SparkSqlParser.g
Maybe split SparkSqlParser.g up as well?
On Thu, Jan 28, 2016 at 5:21 AM, Iulian Dragoș
fileStream has a parameter "newFilesOnly". By default it's true, which means
only new files are processed and files already in the directory are ignored. So
you need to ***move*** the files into the directory; otherwise they will be
ignored.
You can also set "newFilesOnly" to false. Then in the
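A minimal sketch of what this looks like (assuming a StreamingContext named
ssc; the path and input format are placeholders):

import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Pick up files already present in the directory as well as new ones
// by passing newFilesOnly = false (the filter accepts every path here).
val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
  "hdfs:///tmp/input",
  (path: Path) => true,  // filter: accept all files
  newFilesOnly = false
).map(_._2.toString)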
Hi All,
I am trying to run the HdfsWordCount example from GitHub:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala
I am using Ubuntu to run the program, but don't see any data getting printed
after
Hi,
I'm trying to run a SQL query on a Hive table which is stored in HBase.
I'm using:
- Spark 1.6.0
- HDP 2.2
- Hive 0.14.0
- HBase 0.98.4
I managed to configure a working classpath, but I have the following problems:
1) I have a UDF defined in the Hive Metastore (FUNCS table).
Spark cannot use it.
File
For the last two problems, hbase-site.xml seems not to be on the classpath.
Once hbase-site.xml is put on the classpath, you should be able to make progress.
Cheers
> On Jan 28, 2016, at 1:14 AM, Maciej Bryński wrote:
>
> Hi,
> I'm trying to run a SQL query on a Hive table which is
Hi all,
Could anyone provide pointers on how to extend the Spark FPGrowth
implementation with either of the following stopping criteria:
* maximum number of generated itemsets,
* maximum length of generated itemsets (i.e. number of items in itemset).
The second criterion is e.g. available in
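In case it helps as a stopgap, with Spark MLlib's FPGrowth you can post-filter
the mined itemsets by length; note this is only a filter on the result, not a
true stopping criterion, so it does not save any computation:

import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD

// assuming an existing SparkContext sc (hypothetical sample data)
val transactions: RDD[Array[String]] =
  sc.parallelize(Seq(Array("a", "b", "c"), Array("a", "b"), Array("a")))
val model = new FPGrowth().setMinSupport(0.5).run(transactions)

// Keep only itemsets with at most 3 items; FP-Growth still generates
// the longer ones internally, they are just dropped afterwards.
val shortItemsets = model.freqItemsets.filter(_.items.length <= 3)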
Hi,
Indeed, Hive is not able to perform predicate pushdown through an HBase table.
Neither Hive nor Impala can.
Broadly speaking, if you need to query your HBase table through a field other
than the rowkey:
A) Try to "encode" as much info as possible in the rowkey field and use it as
your
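For illustration, a hedged sketch with the HBase client API; the '#'-delimited
key layout is made up, but it shows how predicates on the encoded fields become
an efficient rowkey range scan:

import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Bytes

// Assumed rowkey layout: <province>#<day>#<id>
// A filter on province='LIAONING' and day='20151217' becomes a range scan
// instead of a full table scan with a per-row filter.
val scan = new Scan()
scan.setStartRow(Bytes.toBytes("LIAONING#20151217#"))
scan.setStopRow(Bytes.toBytes("LIAONING#20151218#"))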
We always use SQL like the following:
select count(*) from ydb_example_shu where ydbpartion='20151110' and
(ydb_sex='' or ydb_province='LIAONING' or ydb_day>='20151217') limit 10
Spark doesn't push down predicates via TableScanDesc.FILTER_EXPR_CONF_STR,
which means that every query is a full scan.
Probably a newer Hive version makes a lot of sense here, at least 1.2.1. What
storage format are you using?
I think the old Hive version had a bug where it always scanned all partitions
unless you limited it in the ON clause of the query to a certain partition (e.g.
ON date=20201119).
> On 28 Jan
Hi,
Has anyone seen this error?
The code of method specialStateTransition(int, IntStream) is exceeding
the 65535 bytes limit (SparkSqlParser_IdentifiersParser.java:39907)
The error is in ANTLR-generated files and it's (according to Stack
Overflow) due to state explosion in the parser (or lexer).
If we supported TableScanDesc.FILTER_EXPR_CONF_STR like Hive does,
we could write SQL like this:
select ydb_sex from ydb_example_shu where ydbpartion='20151110' limit 10
select ydb_sex from ydb_example_shu where ydbpartion='20151110' and
(ydb_sex='??' or ydb_province='' or ydb_day>='20151217') limit
Is there anybody who can solve Problem 4)? Thanks.
Problem 4)
Spark doesn't push down predicates for HiveTableScan, which means that every
query is a full scan.
-- Original Message --
From: "Julio Antonio Soto de Vicente";
Date: January 28, 2016 (Thursday)
This is not Hive's bug. I tested Hive on my storage and it is OK,
but when I test it on Spark SQL, the TableScanDesc.FILTER_EXPR_CONF_STR
params are not passed; that is what causes the full scan.
The source code in HiveHBaseTableInputFormat is as follows; that is the
reason for the full scan.
private
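The gist of the check being quoted, in a hedged Scala rendering (the actual
Hive code is Java):

import org.apache.hadoop.hive.ql.plan.TableScanDesc
import org.apache.hadoop.mapred.JobConf

// If the caller (here: Spark) never puts a serialized filter expression
// into the JobConf, the HBase input format has nothing to narrow the scan
// with, so it falls back to scanning the whole table.
def pushedFilterMissing(jobConf: JobConf): Boolean =
  jobConf.get(TableScanDesc.FILTER_EXPR_CONF_STR) == null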
Ted,
You're right.
hbase-site.xml resolved problems 2 and 3, but...
Problem 4)
Spark doesn't push down predicates for HiveTableScan, which means that every
query is a full scan.
== Physical Plan ==
TungstenAggregate(key=[],
functions=[(count(1),mode=Final,isDistinct=false)],
output=[count#144L])
+-
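For what it's worth, a plan like the one above can be inspected for pushed
filters with explain; a hypothetical example:

// Print the physical plan to check whether any filter was pushed down
sqlContext.sql(
  "select count(*) from hbase_backed_table where some_col = 'x'").explain(true)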
Dear Spark,
I am testing a StorageHandler on Spark SQL,
but I find that TableScanDesc.FILTER_EXPR_CONF_STR is missing. I need it; is
there anywhere I could find it?
I really want to get some filter information from Spark SQL, so that I can
do a pre-filter with my index;
so where is the
Hi All,
I am trying to run a series of transformations over 3 DataFrames. After each
transformation, I want to persist the DF and save it to a text file. The steps I
am doing are as follows.
*Step0:*
Create DF1
Create DF2
Create DF3
Create DF4
(no persist no save yet)
*Step1:*
Create RESULT-DF1 by
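A minimal sketch of the persist-then-save pattern being described (the
DataFrame names, join keys and output path are placeholders):

import org.apache.spark.storage.StorageLevel

// Persist before the first action so later steps reuse the cached data
// instead of recomputing the whole lineage from DF1..DF4.
val result1 = df1.join(df2, "id")  // placeholder transformation
result1.persist(StorageLevel.MEMORY_AND_DISK)
result1.rdd.map(_.mkString("\t")).saveAsTextFile("/tmp/result1")  // first action

// Subsequent steps read the cached result1 rather than the raw inputs.
val result2 = result1.join(df3, "id")  // placeholder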
Hey spark-devs,
I'm in the process of writing a DataSource for what is essentially a Java
web service. Each relation we create will consist of a series of
queries to this web service, which returns a pretty much known amount of data
(e.g. 2000 rows, 5 string columns or similar) which we can
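For reference, a skeleton of the Spark 1.x external data source API that
matches this description (class and column names are hypothetical):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// A relation backed by a web service with a known, small result size.
class WebServiceRelation(@transient val sqlContext: SQLContext)
    extends BaseRelation with TableScan {

  // Known schema: five string columns in this sketch.
  override def schema: StructType =
    StructType((1 to 5).map(i => StructField(s"col$i", StringType)))

  override def buildScan(): RDD[Row] = {
    // Issue the web-service queries on the driver, then parallelize;
    // reasonable when the result is known to be ~2000 rows.
    sqlContext.sparkContext.parallelize(fetchFromService())
  }

  // Placeholder for the actual web-service calls.
  private def fetchFromService(): Seq[Row] = Seq.empty
}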