Brave and wise answer :)


Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 August 2016 at 12:24, Gourav Sengupta <gourav.sengu...@gmail.com>
wrote:

> I think I have stirred up a hornet's nest here. If you are comfortable
> calling any faster way to access data an index, then that's fine. Everyone
> uses indexes and will continue to do so for the foreseeable future.
>
> When I think about reaching data faster, I just refer to the methods
> available currently as algorithms.
>
>
> Regards,
> Gourav
>
> On Mon, Aug 15, 2016 at 11:59 AM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> My two cents
>>
>> Indexes of any form and shape are there to speed up queries, whether it
>> is a classical index (B-tree), a store index (data and stats stored
>> together) as in Oracle Exalytics, SAP HANA, or Hive ORC tables, or a hash
>> index as in in-memory databases. In every case the index exists to speed
>> up the access path.
>>
>> The issue with indexes on Big Data is that HDFS lacks the ability to
>> co-locate blocks, which is a challenge and may be one of the reasons that
>> indexes are not as common in the Big Data world as elsewhere. However,
>> that is changing. Bottom line: it sounds like Big Data has to perform on
>> par with a transactional database in answering queries, with a very fast
>> access path.
>>
>> HTH
>>
>> On 15 August 2016 at 11:19, u...@moosheimer.com <u...@moosheimer.com>
>> wrote:
>>
>>> So you mean HBase, Cassandra, Hana, Elasticsearch and so on do not use
>>> indexes?
>>> There might be some very interesting new concepts I've missed?
>>>
>>> Could you be more precise?
>>>
>>> ;-)
>>>
>>> Regards,
>>> Uwe
>>>
>>>
>>>
>>> On 15.08.2016 at 11:59, Gourav Sengupta wrote:
>>>
>>> The world has moved on from indexes, materialized views, and other
>>> single-processor, non-distributed system algorithms. Nice that you are not
>>> asking questions regarding hierarchical file systems.
>>>
>>>
>>> Regards,
>>> Gourav
>>>
>>> On Sun, Aug 14, 2016 at 4:03 AM, Taotao.Li <charles.up...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> Hi guys, does Spark SQL support indexes? If so, how can I create an
>>>> index on my temp table? If not, how can I handle specific queries on a
>>>> very large table? It would scan the whole table even though all I want is
>>>> just a small piece of it.
>>>>
>>>> Many thanks,
>>>>
>>>>
>>>> *___________________*
>>>> Quant | Engineer | Boy
>>>> *___________________*
>>>> *blog*:    http://litaotao.github.io
>>>> *github*: www.github.com/litaotao
>>>>
>>>>
>>>>
>>>
>>>
>>
>
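
To make the "store index (data and stats stored together)" idea from the thread concrete, here is a minimal sketch in plain Python. It is not how ORC, Spark SQL, or any of the products named above implement it; the names `Block`, `build_blocks`, and `lookup` are hypothetical and exist only for illustration. Each block keeps the minimum and maximum of its values next to the data, so a lookup can skip any block whose range cannot contain the target instead of scanning the whole table.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A chunk of column values stored together with its min/max stats."""
    rows: List[int]
    min_val: int
    max_val: int


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split values into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def lookup(blocks: List[Block], target: int) -> Tuple[List[int], int]:
    """Return matching values and the number of blocks actually scanned."""
    matches: List[int] = []
    scanned = 0
    for block in blocks:
        # The stats alone prove the target cannot be in this block: skip it.
        if target < block.min_val or target > block.max_val:
            continue
        scanned += 1
        matches.extend(v for v in block.rows if v == target)
    return matches, scanned


# Hypothetical data: 1000 sorted values in blocks of 100.
values = list(range(1000))
blocks = build_blocks(values, 100)
matches, scanned = lookup(blocks, 250)
print(f"found {matches} after scanning {scanned} of {len(blocks)} blocks")
```

The benefit depends on how the data is laid out: with sorted or clustered values most blocks are skipped, while with randomly ordered values the min/max ranges overlap and most blocks still have to be read.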
