Thank you for the clarification. What are the performance implications for
search queries of using HDFS vs. Kudu when storing large datasets (~10,000
records per table)? Does storing large datasets in Kudu improve search
performance?
Thanks again,
-V

On Fri, Dec 6, 2019 at 12:51 PM Thomas Tauber-Marshall <
tmarsh...@cloudera.com> wrote:

> Yes, you can use Impala to run queries against data in HDFS. Kudu is not
> required.
>
> By default, new tables are created on HDFS. To create Kudu tables, or to
> control the file format in which the table's data is saved in HDFS, you
> can use the "STORED AS" clause with CREATE TABLE. To control where in HDFS
> the data is stored, use the LOCATION clause with CREATE TABLE. To
> query data that is already in HDFS (rather than creating a new, empty
> table), use EXTERNAL and LOCATION with CREATE TABLE.
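>
> As a rough sketch of the three cases above (the table names, column
> names, and HDFS paths are hypothetical placeholders, and the comma
> delimiter in the last statement is an assumption about the existing
> files, not something taken from your setup):
>
> ```sql
> -- New HDFS-backed table; STORED AS picks the file format and
> -- LOCATION picks the HDFS directory (hypothetical path).
> CREATE TABLE events_parquet (
>   id BIGINT,
>   name STRING
> )
> STORED AS PARQUET
> LOCATION '/data/events_parquet';
>
> -- New Kudu-backed table; Kudu tables need a primary key.
> CREATE TABLE events_kudu (
>   id BIGINT,
>   name STRING,
>   PRIMARY KEY (id)
> )
> PARTITION BY HASH (id) PARTITIONS 4
> STORED AS KUDU;
>
> -- Table over files that already exist in HDFS (hypothetical path,
> -- assumed to hold comma-delimited text files).
> CREATE EXTERNAL TABLE events_raw (
>   id BIGINT,
>   name STRING
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> LOCATION '/data/events_raw';
> ```
>
> Dropping an EXTERNAL table removes only the table definition and
> leaves the files in HDFS in place.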
>
> There are a bunch more details in the documentation:
> http://impala.apache.org/docs/build/html/topics/impala_create_table.html
>
>
> On Fri, Dec 6, 2019 at 9:43 AM l vic <lvic4...@gmail.com> wrote:
>
>> After a first look at the documentation and tutorial, I am still confused
>> about how to use and configure a storage backend for Impala. Can I use
>> Impala SQL to run queries against data in HDFS, or do I need a backend data
>> server like Kudu? How do I specify data storage in the CREATE TABLE statement?
>> Thank you,
>> -V
>>
>
