The opinions expressed here are mine, while they may reflect a cognitive
thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
etc for any table created?
Thanks,
Rahul
--
Thanks & Regards,
Anil Gupta
time out. Instead of doing this, I want the timeout or some exception
to be thrown from the HBase client itself.
On Thu, Jun 11, 2015 at 5:16 AM, Michael Segel michael_se...@hotmail.com
wrote:
threads?
So that regardless of your hadoop settings, if you want something faster,
you can use one
TM == Trade Mark
On Jun 12, 2015, at 11:55 AM, hariharan_sethura...@dell.com wrote:
The article starts with Apache HBase (TM) - does it stand for Transaction
Manager?
Apache HBase (TM) is not an ACID compliant database
...
-Original Message-
When in doubt, printf() can be your friend.
Yeah it's primitive (old school) but effective.
Then you will know what you’re adding to your list for sure.
On Jun 10, 2015, at 12:39 PM, beeshma r beeshm...@gmail.com wrote:
Hi Devaraj
Thanks for your suggestion.
Yes, I coded like this as
threads?
So that regardless of your hadoop settings, if you want something faster, you
can use one thread for a timer and then the request is in another. So if you
hit your timeout before you get a response, you can stop your thread.
(YMMV depending on side effects… )
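A minimal sketch of that timer-plus-request pattern, assuming the 0.94/0.98-era client API and a hypothetical table and row:

    import java.io.IOException;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TimedGet {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            final HTable table = new HTable(conf, "mytable"); // hypothetical table name
            ExecutorService pool = Executors.newSingleThreadExecutor();
            // The request runs in one thread; the timed wait below is the "timer".
            Future<Result> future = pool.submit(new Callable<Result>() {
                public Result call() throws IOException {
                    return table.get(new Get(Bytes.toBytes("row1")));
                }
            });
            try {
                // Your own deadline, independent of the hbase.rpc.timeout setting.
                Result r = future.get(2, TimeUnit.SECONDS);
                System.out.println(r);
            } catch (TimeoutException e) {
                future.cancel(true); // give up and interrupt the request thread
            } finally {
                pool.shutdownNow();
                table.close();
            }
        }
    }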
On Jun 10, 2015, at
Well since you brought up coprocessors… let's talk about the lack of security and
stability that's been introduced by coprocessors. ;-)
I’m not saying that you don’t want server side extensibility, but you need to
recognize the risks introduced by coprocessors.
On May 31, 2015, at 3:32 PM,
Saying Ambari rules is like saying that you like to drink MD 20/20 and calling
it a fine wine.
Sorry to all the Hortonworks guys, but Ambari has a long way to go… very
immature.
What that has to do with Cassandra vs HBase? I haven’t a clue.
The key issue is that unless you need or want to
not a stand alone product or system.
Hello, what is the use case of a big data application w/o Hadoop?
-Vlad
On Mon, Jun 1, 2015 at 2:26 PM, Michael Segel michael_se...@hotmail.com
wrote:
Saying Ambari rules is like saying that you like to drink MD 20/20 and
calling it a fine wine.
Sorry
This is why I created HBASE-12853.
So you don’t have to specify a custom split policy.
Of course the simple solutions are often passed over because of NIH. ;-)
To be blunt… You encapsulate the bucketing code so that you have a single API
into HBase regardless of the type of storage
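A minimal sketch of that kind of encapsulation, with hypothetical names; callers hand in the logical key and never see the bucket prefix:

    import org.apache.hadoop.hbase.util.Bytes;

    public class BucketedKey {
        private final int buckets;

        public BucketedKey(int buckets) { this.buckets = buckets; }

        // Prepend a one-byte bucket derived from the logical key.
        public byte[] toStorageKey(byte[] logicalKey) {
            int h = Bytes.hashCode(logicalKey);
            byte bucket = (byte) (((h % buckets) + buckets) % buckets);
            byte[] out = new byte[logicalKey.length + 1];
            out[0] = bucket;
            System.arraycopy(logicalKey, 0, out, 1, logicalKey.length);
            return out;
        }

        // Strip the prefix when handing rows back to the caller.
        public byte[] toLogicalKey(byte[] storageKey) {
            return Bytes.tail(storageKey, storageKey.length - 1);
        }
    }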
Look, to be blunt, you’re screwed.
If I read your cluster spec… it sounds like you have a single i7 (quad core)
CPU. That's 4 cores or 8 threads.
Mirroring the OS is common practice.
Using the same drives for Hadoop… not so good, but once the server boots up… not
so much I/O.
It's not good,
Why spring?
Why a DAO?
I’m not suggesting that using Spring or a DAO is wrong, however, you really
should justify it.
Since it looks like you’re trying to insert sensor data (based on the naming
convention), what’s the velocity of the inserts?
Are you manually flushing commits or are you
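For reference, a minimal sketch of client-side write buffering with the 0.94/0.98-era API (table and column names are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BufferedWrites {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "sensor"); // hypothetical table name
            table.setAutoFlush(false);                 // buffer puts client-side
            table.setWriteBufferSize(4 * 1024 * 1024);
            for (int i = 0; i < 10000; i++) {
                Put p = new Put(Bytes.toBytes("device1-" + i));
                p.add(Bytes.toBytes("d"), Bytes.toBytes("v"), Bytes.toBytes(i));
                table.put(p);                          // queued in the write buffer
            }
            table.flushCommits();                      // flushed in batched RPCs
            table.close();
        }
    }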
C’mon, really?
Do they really return the same results?
Let me put it this way… are you walking through the same code path?
On May 19, 2015, at 10:34 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Aren't Scans and Gets supposed to be almost as fast?
I have a pretty small
Without knowing your exact configuration…
The high CPU may be wait I/Os, which would mean that your CPU is waiting for
reads from the local disks.
What’s the ratio of cores (physical) to disks?
What type of disks are you using?
That’s going to be the most likely culprit.
On May 13,
version if I can find something.
cores / disks == 24 / 12 or 40 / 12.
We are using 10K SATA drives on our datanodes.
Rahul
On Wed, May 13, 2015 at 10:00 AM, Michael Segel
michael_se...@hotmail.com
wrote:
Without knowing your exact configuration…
The high CPU may be wait I/Os
Yeah, it's about time.
What a slacker! :-P
On May 11, 2015, at 6:56 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
This? http://shop.oreilly.com/product/0636920033943.do
2015-05-11 18:55 GMT-04:00 Michael Segel michael_se...@hotmail.com:
Why would you expect to have a region
files.
I think that says it all.
Do you really want to open up your HBase snapshots to anyone?
working from different assumptions?
On Tue, May 5, 2015 at 4:46 PM, Michael Segel michael_se...@hotmail.com
wrote:
Yes, what you described, mod(hash(rowkey), n) where n is the number of
regions, will remove the hotspotting issue.
However, if your key is sequential you will only have regions
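Since mod(hash(rowkey), n) scatters what used to be a contiguous range, a range query turns into one scan per bucket. A minimal sketch, with hypothetical names:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BucketScans {
        // One scan per bucket: prefix the logical start/stop keys with each salt byte.
        static List<Result> scanAllBuckets(HTable table, int n,
                byte[] start, byte[] stop) throws IOException {
            List<Result> out = new ArrayList<Result>();
            for (int b = 0; b < n; b++) {
                Scan scan = new Scan(Bytes.add(new byte[] {(byte) b}, start),
                                     Bytes.add(new byte[] {(byte) b}, stop));
                ResultScanner rs = table.getScanner(scan);
                try {
                    for (Result r : rs) out.add(r);
                } finally {
                    rs.close();
                }
            }
            return out;
        }
    }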
situation where I don't need range scans.
For example, let's say my key value is a person's last name. That will
naturally cluster around certain letters, giving me an uneven distribution.
--Jeremy
On Sun, May 3, 2015 at 11:46 AM, Michael Segel michael_se...@hotmail.com
wrote:
Yes
situation to retest it.
On Thu, Apr 30, 2015 at 3:56 PM Michael Segel
michael_se...@hotmail.com
wrote:
There is no single ‘right’ value.
As you pointed out… some of your Mapper.map() iterations are taking
longer
than 60 seconds.
The first thing is to determine why that happens
with this approach in future? First of all, is this approach
correct?
Thanks,
Arun
I wouldn’t call storing attributes in separate columns a ‘rigid schema’.
You are correct that you could write your data as a CLOB/BLOB and store it in a
single cell.
The upside is that it's more efficient.
The downside is that it's really an all or nothing fetch and then you need to
write the
I would look at a different solution than HBase.
HBase works well because it's tied closely to HDFS and the Hadoop ecosystem.
Going outside of this… too many headaches and you’d be better off with a NoSQL
engine like Cassandra or Riak, or something else.
On Apr 30, 2015, at 8:35 AM, Buğra
than ours was, but it may be helpful to
hear our experience with row key design
http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/video-hbasecon-2012-real-performance-gains-with-real-time-data.html
James
On Apr 30, 2015, at 7:51 AM, Michael Segel michael_se
cleaner (we
currently create a scan with our predicate for each bucket, and then push all
of those to MultiTableInputFormat).
Best,
Andrew
On 4/30/15 12:36 PM, Michael Segel wrote:
The downside
here is that you will lose your ability to perform range scans
There is no single ‘right’ value.
As you pointed out… some of your Mapper.map() iterations are taking longer than
60 seconds.
The first thing is to determine why that happens. (It could be normal, or it
could be bad code on your developer's part. We don't know.)
The other thing is that if
if anyone knows of any related work in this area.
Thoughts and suggestions welcome.
Thanks,
Ayya
and thanks for the tip! :-)
On Wed, Apr 8, 2015 at 1:45 PM, Michael Segel michael_se...@hotmail.com
wrote:
Ok…
First, I’d suggest you rethink your schema by adding an additional
dimension.
You’ll end up with more rows, but a narrower table.
In terms of compaction… if the data
apurt...@apache.org
To: user@hbase.apache.org user@hbase.apache.org
Sent: Thursday, April 9, 2015 4:53 PM
Subject: Re: Rowkey design question
On Thu, Apr 9, 2015 at 2:26 PM, Michael Segel michael_se...@hotmail.com
wrote:
Hint: You could have sandboxed the end user code which makes it a lot
that would do
the same calculation on its own.
On Thu, Apr 9, 2015 at 4:43 AM, Michael Segel michael_se...@hotmail.com
wrote:
When you say coprocessor, do you mean HBase coprocessors or do you mean a
physical hardware coprocessor?
In terms of queries…
HBase can perform a single get
.
On Thu, Apr 9, 2015 at 5:05 AM, Michael Segel michael_se...@hotmail.com
wrote:
Ok…
Coprocessors are poorly implemented in HBase.
If you work in a secure environment, outside of the system coprocessors…
(ones that you load from hbase-site.xml), you don’t want to use them. (The
coprocessor code
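For reference, a minimal sketch contrasting the two loading paths, assuming the 0.94/0.98-era API and a hypothetical observer class:

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;

    public class CoprocessorAttach {
        // System coprocessors are listed in hbase-site.xml under
        // hbase.coprocessor.region.classes and load into every region.
        // A table coprocessor is attached per table, like this:
        public static HTableDescriptor describe() throws Exception {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
            desc.addFamily(new HColumnDescriptor("cf"));
            desc.addCoprocessor("com.example.MyRegionObserver"); // hypothetical class
            return desc;
        }
    }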
unaware of?
On Wed, Apr 8, 2015 at 7:43 PM, Michael Segel michael_se...@hotmail.com
wrote:
I think you misunderstood.
The suggestion was to put the data into HDFS sequence files and to use
HBase to store an index into the file. (URL to the file, then offset into
the file
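A minimal sketch of that index pattern, with hypothetical family and qualifier names: the HBase row stores only the file path and offset, and the read seeks straight to the record:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class BlobFetch {
        // The HBase row holds only a pointer: the file's path and the record's offset.
        static byte[] fetch(Configuration conf, HTable index, byte[] key) throws Exception {
            Result r = index.get(new Get(key));
            Path file = new Path(Bytes.toString(
                r.getValue(Bytes.toBytes("i"), Bytes.toBytes("file"))));
            long offset = Bytes.toLong(
                r.getValue(Bytes.toBytes("i"), Bytes.toBytes("offset")));
            SequenceFile.Reader reader =
                new SequenceFile.Reader(file.getFileSystem(conf), file, conf);
            try {
                reader.seek(offset);               // jump straight to the record
                Text k = new Text();
                BytesWritable v = new BytesWritable();
                reader.next(k, v);
                return v.copyBytes();
            } finally {
                reader.close();
            }
        }
    }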
at 4:41 AM, Michael Segel michael_se...@hotmail.com
wrote:
Is your table static?
If you know your data and your ranges, you can do it. However as you add
data to the table, those regions will eventually split.
The other issue that you brought up is that you want to do ‘local’ joins
for the problem you are trying to solve
is
HBASE-10576 by tweaking it a little.
cheers,
esteban.
--
Cloudera, Inc.
On Wed, Apr 8, 2015 at 4:41 AM, Michael Segel
michael_se...@hotmail.com
wrote:
Is your table static?
If you know your data and your ranges, you can
column qualifier.
Yes, this is not possible if HBase loads the whole 500MB each time I
want
to perform this custom query on a row. Hence my question :-)
On Tue, Apr 7, 2015 at 11:03 PM, Michael Segel
michael_se...@hotmail.com
wrote:
Sorry, but your initial problem statement
actual values (bigger qualifiers)
outside HBase. Keeping them in Hadoop, why not? Pulling hot ones out on SSD
caches would be an interesting solution. And quite a bit simpler.
Good call and thanks for the tip! :-)
On Wed, Apr 8, 2015 at 1:45 PM, Michael Segel michael_se...@hotmail.com
wrote
for a while?
into a direct ByteBuffer) ?
Cheers,
-Kristoffer
, Including
some insert and some update, is this scenario appropriate for using HBase to
build a data warehouse?
2) Are there any case studies about Enterprise BI solutions with HBase?
thanks.
Regards,
Ben Liang
On Apr 6, 2015, at 20:27, Michael Segel michael_se...@hotmail.com wrote:
Yeah
balancer
needs to be run, especially in multi-tenant clusters with archive data. It
is best to immediately run a major compaction to restore HBase locality if
the HDFS balancer is used.
On Mon, Mar 23, 2015 at 10:50 AM, Michael Segel michael_se...@hotmail.com
wrote:
@lars,
How does
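A minimal sketch of that compact-after-balancing step, assuming the 0.94/0.98-era admin API and a hypothetical table name:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class RestoreLocality {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            // Rewriting the HFiles places the new blocks on the local datanode,
            // undoing what the HDFS balancer moved away.
            admin.majorCompact("mytable"); // hypothetical table name
            admin.close();
        }
    }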
the other blocks except the final one had a size of 67108864 as well. HDFS
considered both versions of the block to be corrupt, but at one point I did
replace the truncated data on the one node with the full-length data (to no
avail).
-md
On Thu, Mar 19, 2015 at 6:49 PM, Michael Segel
the volumes will be filled. This even though legacy nodes have 5 volumes
and total storage of 5X TB.
Fact or fantasy?
Thanks,
Ted
are currently on HBase 0.98.6 (CDH 5.3.0)
Thanks,
Abe
, Michael Segel michael_se...@hotmail.com
wrote:
Hi,
I’m trying to understand your problem.
You pre-split your regions to help with some load balancing on the load.
Ok.
So how did you calculate the number of regions to pre-split?
You said that the number of regions has grown. How were
Copy the table, drop original, rename copy.
On Mar 19, 2015, at 3:46 AM, Pankaj kr pankaj...@huawei.com wrote:
Thanks for the reply Ashish.
I can set EMPTY or NONE value using alter command.
alter 't1', {NAME => 'cf1', ENCRYPTION => ''}
alter 't1', {NAME => 'cf1', ENCRYPTION =>
- a means of creating splits based on
regions, without having to iterate over all rows in the table through the
client API. Do you have any idea how I might achieve this?
Thanks,
On Tuesday, March 17, 2015, Michael Segel michael_se...@hotmail.com
wrote:
HBase doesn't have partitions. It has
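One way to get splits from region boundaries without scanning rows is the client's start/end-key call; a minimal sketch, assuming the 0.94/0.98-era HTable API and a hypothetical table name:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.hbase.util.Pair;

    public class RegionSplits {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "mytable"); // hypothetical table name
            // One (start, end) pair per region -- no row iteration required.
            Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
            for (int i = 0; i < keys.getFirst().length; i++) {
                System.out.println(Bytes.toStringBinary(keys.getFirst()[i]) + " .. "
                    + Bytes.toStringBinary(keys.getSecond()[i]));
            }
            table.close();
        }
    }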
, 2015, at 3:44 PM, Sean Busbey bus...@cloudera.com wrote:
On Fri, Mar 13, 2015 at 2:41 PM, Michael Segel michael_se...@hotmail.com
wrote:
In stand alone, you’re writing to local disk. If you lose the disk, you lose
the data, unless of course you’ve raided your drives.
Then when you lose
--
Abraham Tom
Email: work2m...@gmail.com
Phone: 415-515-3621
On 3/13/15, 1:46 PM, Michael Segel michael_se...@hotmail.com
mailto:michael_se...@hotmail.com wrote:
Guys,
More than just needing some love.
No HDFS… means data at risk.
No HDFS… means that stand alone will have security issues.
Patient Data? HINT: HIPAA.
Please think your design
install the thrift server locally on every C++ client
machine? I'd imagine performance should be similar to native java
performance at that point.
-Mike
On Sat, Mar 7, 2015 at 4:49 PM, Michael Segel michael_se...@hotmail.com
wrote:
Or you could try a java connection wrapped by JNI so you
don't expect any significant
difference between Thrift(C++) and Java. Any ideas? Many thanks
Demai
The better answer is that you don’t worry about data locality.
It's becoming a moot point.
On Mar 4, 2015, at 12:32 PM, Andrew Purtell apurt...@apache.org wrote:
Spark supports creating RDDs using Hadoop input and output formats (
)
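A minimal sketch of reading an HBase table into a Spark RDD through TableInputFormat (table name hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class HBaseRdd {
        public static void main(String[] args) {
            JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("hbase-rdd"));
            Configuration conf = HBaseConfiguration.create();
            conf.set(TableInputFormat.INPUT_TABLE, "mytable"); // hypothetical table name
            // The input format splits the RDD along region boundaries.
            JavaPairRDD<ImmutableBytesWritable, Result> rdd = sc.newAPIHadoopRDD(
                conf, TableInputFormat.class, ImmutableBytesWritable.class, Result.class);
            System.out.println("rows: " + rdd.count());
            sc.stop();
        }
    }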
, it dumps them to a directory in hdfs.
--
Sean
On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
2.0.0-cdh4.6.0).
For all of our tables, we set the replication factor to 1 (dfs.replication
= 1 in hbase-site.xml). We set to 1 because we want to minimize the
Hi,
Yes you would want to start your key by user_id.
But you don’t need the timestamp. The user_id + alert_id should be enough on
the key.
If you want to get fancy…
If your alert_id is not a number, you could use the EPOCH - Timestamp as a way
to invert the order of the alerts so that the
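A minimal sketch of such a key; the mail says "EPOCH - Timestamp", and Long.MAX_VALUE - timestamp below is the common equivalent (names hypothetical):

    import org.apache.hadoop.hbase.util.Bytes;

    public class AlertKey {
        // user_id first (groups a user's alerts), then an inverted timestamp
        // so the newest alert sorts first within each user.
        static byte[] rowkey(String userId, long alertMillis) {
            return Bytes.add(Bytes.toBytes(userId),
                             Bytes.toBytes(Long.MAX_VALUE - alertMillis));
        }
    }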
Yes and no.
It's a bit more complicated; it's data dependent, and it depends on how you're
using the data.
I wouldn't go too thin and I wouldn't go too fat.
On Feb 20, 2015, at 2:19 PM, Alok Singh aloksi...@gmail.com wrote:
You don't want a lot of columns in a write heavy table. HBase stores
@Ted,
Pseudo cluster on a machine that has 4GB of memory.
If you give HBase 1.5GB for the region server… you are left with 2.5 GB of
memory for everything else.
You will swap.
In short, nothing he can do will help. He’s screwed if he is looking at
improving performance.
On Jan 11,
Storing aggregates on its own? No.
Storing aggregates of a data set that is the primary target? Sure. Why not?
On Jan 9, 2015, at 9:00 PM, Buntu Dev buntu...@gmail.com wrote:
I got a CDH cluster with data being ingested via Flume to store in HDFS as
Avro. Currently, I query the dataset using
Guys,
You have two issues.
1) Physical structure and organization.
2) Logical organization and data usage.
This goes to the question of your data access pattern and use case.
The best example of how to use Column Families that I can think of is an order
entry system.
Here you would have
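A hypothetical sketch of that kind of order-entry layout (family names invented for illustration):

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;

    public class OrderSchema {
        public static HTableDescriptor describe() {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("orders"));
            // Families separate data with different access patterns: the header
            // is read on every lookup, line items only on drill-down.
            desc.addFamily(new HColumnDescriptor("header")); // customer, date, status
            desc.addFamily(new HColumnDescriptor("lines"));  // one column per line item
            return desc;
        }
    }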
… It was
a mess so we never looked back. And of course the client was/is a java shop.
So Java is the first choice.
Just my $0.02
On Dec 1, 2014, at 2:41 PM, Aleks Laz al-userhb...@none.at wrote:
Dear Michael.
On 29-11-2014 23:49, Michael Segel wrote:
Guys, KISS.
You can use
Hi,
Let's take a step back… OP's initial goal is to replace all of the fields/cells
on a row at the same time.
Thought about doing a delete prior to the put().
Is now a good time to remind people about what happens during a delete and how
things can happen out of order?
And should we talk
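A minimal sketch of the out-of-order hazard being alluded to, assuming the 0.94/0.98-era API: the Delete's tombstone carries the current timestamp, so a Put landing in the same millisecond can be masked until the tombstone is collected at major compaction:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DeleteThenPut {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "t"); // hypothetical table name
            byte[] row = Bytes.toBytes("r1");
            table.delete(new Delete(row));  // tombstone stamped with current time
            Put p = new Put(row);           // if this lands in the same millisecond,
            p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("new"));
            table.put(p);                   // the tombstone can mask it
            table.close();
        }
    }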
Not sure of the question.
A scan will return multiple rows in sequential order. Note that it's sequential
byte stream order.
The columns will also be in sequential order as well…
So if you have a set of columns named 'foo'+timestamp, then for each column in
the set of foo, it will be in
St.Ack,
I think you're side stepping the issue concerning schema design.
Since HBase isn't my core focus, I also have to ask: since when have heap sizes
over 16GB been the norm?
(Really 8GB seems to be quite a large heap size... )
On Oct 31, 2014, at 11:15 AM, Stack st...@duboce.net wrote:
rows with fewer versions each, instead of
these fat rows. While not exactly the same, you might be able to use TTL
or your own purge job to keep the number of rows limited.
On Mon, Nov 3, 2014 at 2:02 PM, Michael Segel mse...@segel.com wrote:
St.Ack,
I think you're side stepping the issue
Here’s the simple answer.
Don’t do it.
The way you are abusing versioning is a bad design.
Redesign your schema.
On Oct 30, 2014, at 10:20 AM, Andrejs Dubovskis dubis...@gmail.com wrote:
Hi!
We have a bunch of rows on HBase which store varying sizes of data
(1-50MB). We use HBase
is upgrade the coprocessor
in the Standby and then swap the clusters. But since you would have to
stand up a second HBase cluster, this may be a non-starter for you. Just
another option thrown into the mix. :)
On Wed Oct 29 2014 at 12:07:02 PM Michael Segel mse...@segel.com wrote:
Well you
Well you could redesign your cp.
There is a way to work around the issue by creating a cp that's really a
framework and then managing the cps in a separate JVM (or JVMs) using messaging
between the two.
So if you want to reload or restart your cp, you can do it outside of the RS.
It's a bit more
OP wants to know good use cases where to use ttl setting.
Answer: Any situation where the cost of retaining the data exceeds the value to
be gained from the data. Using ttl allows for automatic purging of data.
Answer2: Any situation where you have to enforce specific retention policies
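A minimal sketch of setting a TTL on a column family for such a retention policy (family name hypothetical):

    import org.apache.hadoop.hbase.HColumnDescriptor;

    public class RetentionPolicy {
        public static HColumnDescriptor thirtyDayFamily() {
            HColumnDescriptor cf = new HColumnDescriptor("d"); // hypothetical family
            cf.setTimeToLive(30 * 24 * 60 * 60); // seconds; expired cells are
            return cf;                           // purged during compactions
        }
    }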
You need to create two sets of Hadoop configurations and deploy them to the
correct nodes.
Yarn was supposed to be the way to heterogeneous clusters.
But this begs the question: why on earth did you have a 32 bit cluster to begin
with?
On Sep 16, 2014, at 1:13 AM, Esteban Gutierrez
. and this would again be a different discussion.)
HTH
-Mike
On Sep 10, 2014, at 10:25 PM, Wilm Schumacher wilm.schumac...@cawoom.com
wrote:
On 10.09.2014 22:25, Michael Segel wrote:
Ok, but here’s the thing… you extrapolate the design out… each column
with a subordinate record
Let's take a step back…
Your parallel scan is having the client create N threads where in each thread,
you’re doing a partial scan of the table where each partial scan takes the
first and last row of each region?
Is that correct?
On Sep 12, 2014, at 7:36 AM, Guillermo Ortiz
        ) {
            results.add(result);
        }
        connection.close();
        table.close();
        return results;
    }
They implement Callable.
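The fragment above reads like the tail of such a task. A fuller sketch of one per-range Callable, assuming the 0.94/0.98-era API and a hypothetical table name:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class RegionScanTask implements Callable<List<Result>> {
        private final Configuration conf;
        private final byte[] startRow, stopRow;

        RegionScanTask(Configuration conf, byte[] startRow, byte[] stopRow) {
            this.conf = conf;
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        // Scans one region's row range; the caller submits one task per region.
        public List<Result> call() throws IOException {
            HTable table = new HTable(conf, "mytable"); // hypothetical table name
            List<Result> results = new ArrayList<Result>();
            ResultScanner scanner = table.getScanner(new Scan(startRow, stopRow));
            try {
                for (Result result : scanner) {
                    results.add(result);
                }
            } finally {
                scanner.close();
                table.close();
            }
            return results;
        }
    }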
2014-09-12 9:26 GMT+02:00 Michael Segel michael_se...@hotmail.com:
Let's take a step back…
Your parallel scan is having the client create N
and they could compete for resources (network, etc.) on this node. It'd be
better to have one thread per RS. But that doesn't answer your questions.
I keep thinking...
2014-09-12 9:40 GMT+02:00 Michael Segel michael_se...@hotmail.com:
Hi,
I wanted to take a step back from the actual code
14:48 GMT+02:00 Michael Segel michael_se...@hotmail.com:
Ok, let's again take a step back…
So you are comparing your partial scan(s) against a full table scan?
If I understood your question, you launch 3 partial scans where you set
the start row and then end row of each scan, right
Because you really don’t want to do that since you need to keep the number of
CFs low.
Again, you can store the data within the structure and index it.
On Sep 10, 2014, at 7:17 AM, Wilm Schumacher wilm.schumac...@cawoom.com wrote:
as stated above you can use JSON or something similar, which
wrote:
On 10.09.2014 17:33, Michael Segel wrote:
Because you really don’t want to do that since you need to keep the number
of CFs low.
in my example the number of CFs is 1. So this is not a problem.
Best wishes,
Wilm
You do realize that everything you store in HBase is a byte array, right? That
is, each cell is a blob.
So you have the ability to create nested structures like… JSON records? ;-)
So to your point. You can have a column A which represents a set of values.
This is one reason why you shouldn’t
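A minimal sketch of storing a nested record as a single opaque cell (table and names hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class JsonCell {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "people"); // hypothetical table name
            // The whole nested record travels as one opaque byte[] cell.
            String json = "{\"name\":\"Ada\",\"phones\":[\"555-0100\",\"555-0101\"]}";
            Put p = new Put(Bytes.toBytes("row1"));
            p.add(Bytes.toBytes("cf"), Bytes.toBytes("record"), Bytes.toBytes(json));
            table.put(p);
            table.close();
        }
    }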
So you have large RS and you have large regions. Your regions are huge relative
to your RS memory heap.
(Not ideal.)
You have slow drives (5400rpm) and you have a 1GbE network.
You didn't say how many drives per server.
Under load, you will saturate your network with just 4 drives: four drives at
roughly 100 MB/s each is ~400 MB/s of disk bandwidth against the ~125 MB/s that
a 1GbE link can carry. (Give or
that.
With the setLoadColumnFamiliesOnDemand I learned from Ted, looks like the
performance should be similar.
Am I missing something? Please enlighten me.
Jianshi
On Mon, Sep 8, 2014 at 3:41 AM, Michael Segel michael_se...@hotmail.com
wrote:
I would suggest rethinking column families
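For reference, a minimal sketch of the setLoadColumnFamiliesOnDemand call mentioned above (available from roughly 0.94.5 on), with hypothetical family and qualifier names:

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class EssentialCfScan {
        public static Scan build() {
            Scan scan = new Scan();
            // Filter on the small "essential" family; the fat family's blocks
            // are only loaded for rows that pass the filter.
            scan.setFilter(new SingleColumnValueFilter(Bytes.toBytes("meta"),
                Bytes.toBytes("flag"), CompareFilter.CompareOp.EQUAL,
                Bytes.toBytes("1")));
            scan.setLoadColumnFamiliesOnDemand(true);
            return scan;
        }
    }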
that determination after having carefully considered the extent of the
mismatch.
2014-09-09 13:37 GMT-07:00 Michael Segel michael_se...@hotmail.com:
You do realize that everything you store in Hbase are byte arrays, right?
That is each cell is a blob.
So you have the ability to create
is mostly in mapreduce jobs.
Jianshi
On Sun, Sep 7, 2014 at 4:52 AM, Michael Segel michael_se...@hotmail.com
wrote:
Again, a silly question.
Why are you using column families?
Just to play devil’s advocate in terms of design, why are you not treating
your row as a record
Again, a silly question.
Why are you using column families?
Just to play devil’s advocate in terms of design, why are you not treating your
row as a record? Think hierarchical, not relational.
This really gets in to some design theory.
Think Column Family as a way to group data that has the
What type of drives, controllers, and network bandwidth do you have?
Just curious.
On Sep 6, 2014, at 7:37 PM, kiran kiran.sarvabho...@gmail.com wrote:
Also the hbase version is 0.94.1
On Sun, Sep 7, 2014 at 12:00 AM, kiran kiran.sarvabho...@gmail.com wrote:
Lars,
We are facing a