Just a thought: will forcing a major compaction take care of this, i.e.
merging smaller tablets and deleting empty ones?
Best regards,
Yamini Joshi
On Mon, Jan 16, 2017 at 4:31 PM, Dickson, Matt MR <
matt.dick...@defence.gov.au> wrote:
> *UNOFFICIAL*
> I have a table that has evolved t
, after any record is passed through the
> complete Iterator "stack" (never only some of the iterators), may choose to
> flush the queue entries back to the client.
>
> Yamini Joshi wrote:
>
>> I see. So, for a scan operation that spans 2 tservers: the client knows
>>
it2 -> it3 -> client (if the max limit is reached), or is
it always at the end of the pipeline?
Best regards,
Yamini Joshi
On Tue, Nov 22, 2016 at 12:36 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Scanners are sequentially communicating with TabletServers, as opposed to
>
it2 on tserver -> it3 on tserver
-> client
Is the processing done in batches?
Is data returned to the client when it reaches the max limit for
table.scan.max.memory, even if it is in the middle of the pipeline above?
Best regards,
Yamini Joshi
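The flush-at-a-memory-limit behaviour being asked about can be modelled in plain Java. This is only a toy sketch under stated assumptions: real tservers count serialized bytes against table.scan.max.memory, whereas here a simple entry count stands in for the memory limit, and strings stand in for key/value pairs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: entries that have passed through the complete iterator
// stack are queued, and the queue is flushed back to the client
// whenever it reaches the limit (here a count, standing in for
// table.scan.max.memory).
public class BatchFlushSketch {
    public static List<List<String>> batches(List<String> entries, int maxPerBatch) {
        List<List<String>> out = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String e : entries) {
            current.add(e);                      // entry has cleared the whole stack
            if (current.size() == maxPerBatch) { // "max memory" reached: flush
                out.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) out.add(current); // final partial batch
        return out;
    }

    public static void main(String[] args) {
        System.out.println(batches(List.of("a", "b", "c", "d", "e"), 2));
        // [[a, b], [c, d], [e]]
    }
}
```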
On Tue, Nov 22, 2016 at 11:56 AM, Christ
have some doubts:
1. Where is the data from tserver1 and tserver2 merged?
2. When and how are custom iterators applied?
Also, if there is any resource explaining this, please point me to it. I've
found some slides but no detailed explanation.
Best regards,
Yamini Joshi
I figured the same but forgot to factor in the HDFS part.
Best regards,
Yamini Joshi
On Sat, Nov 19, 2016 at 6:13 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Hi Yamini,
>
> I'd just add one word of caution about knowing exactly what you're
> trying to measure. For example,
Hello all
I am trying to track the performance of my queries on Accumulo. I need to
clear the cache before every query in order to get clean timing values.
Could anyone please tell me how this could be achieved?
Best regards,
Yamini Joshi
Hello all
Does HDFS replication improve the performance of queries on Accumulo, or is
it transparent to the Accumulo system? If it does improve performance
by some notion of load balancing, is there a read-only or write-only
copy of data on HDFS for Accumulo?
Best regards,
Yamini Joshi
way to go about it?
Best regards,
Yamini Joshi
regards,
Yamini Joshi
from batch_scan before passing the
data to other iterators?
Best regards,
Yamini Joshi
Thank you for the reply! I'll try this and get back to you. Also, I found a
MultiIterator class. Any ideas on how it works? Will it work with a batch
scan and sort data before passing it to other iterators?
Best regards,
Yamini Joshi
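MultiIterator's job, merging several already-sorted sources into one sorted stream, can be sketched in plain Java. This is only the merge idea, not the real Accumulo API: sorted `List<String>`s stand in for `SortedKeyValueIterator` sources, and a heap always emits the smallest current element.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Heap-based k-way merge of sorted sources into one sorted stream.
public class MultiMergeSketch {
    public static List<String> merge(List<List<String>> sources) {
        // Heap entry: {sourceIndex, positionInSource}, ordered by the
        // element each entry currently points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            Comparator.comparing((int[] e) -> sources.get(e[0]).get(e[1])));
        for (int i = 0; i < sources.size(); i++) {
            if (!sources.get(i).isEmpty()) heap.add(new int[] {i, 0});
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            List<String> src = sources.get(top[0]);
            out.add(src.get(top[1]));
            // Advance this source and re-insert if it has more elements.
            if (top[1] + 1 < src.size()) heap.add(new int[] {top[0], top[1] + 1});
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(
            List.of("a", "d"), List.of("b", "c"), List.of("e"))));
        // [a, b, c, d, e]
    }
}
```

Note this presumes each source is itself sorted; a BatchScanner's overall output is not, which is why merging its results needs an extra sort or a structure like the above fed with per-tablet streams.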
On Fri, Oct 21, 2016 at 6:35 AM, <dlmar...@comcast.net>
in which belong the list
cardinality(Y intersection C)
Best regards,
Yamini Joshi
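The fragment above mentions cardinality(Y intersection C). Computed client-side it is just set arithmetic; a minimal sketch, where Y and C as sets of student/course ids are only an assumption from the thread context:

```java
import java.util.HashSet;
import java.util.Set;

// cardinality(Y ∩ C): size of the intersection of two id sets.
public class IntersectionSketch {
    public static int intersectionCardinality(Set<String> y, Set<String> c) {
        Set<String> tmp = new HashSet<>(y); // copy so the inputs stay untouched
        tmp.retainAll(c);                   // keep only elements also in c
        return tmp.size();
    }

    public static void main(String[] args) {
        Set<String> y = Set.of("s1", "s2", "s3");
        Set<String> c = Set.of("s2", "s3", "s4");
        System.out.println(intersectionCardinality(y, c)); // 2
    }
}
```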
On Thu, Oct 20, 2016 at 7:16 PM, Dave <dlmar...@comcast.net> wrote:
> I'm a little confused to the use case here. Are you trying to find courses
> that students are taking where the students are in a part
ored as rows, or otherwise moving the columns into the
> rows.
>
> Regards, Dylan
>
> On Thu, Oct 20, 2016 at 3:45 PM, Yamini Joshi <yamini.1...@gmail.com>
> wrote:
>
>> Hello all
>>
>> Is it possible to configure an iterator that works as a filter? As per
I will take a look at it. Thanks Josh :)
Best regards,
Yamini Joshi
On Thu, Oct 20, 2016 at 5:30 PM, Josh Elser <josh.el...@gmail.com> wrote:
> You can do a partial summation in an Iterator, but managing memory
> pressure (like you originally pointed out) would require
in iterator and go to the range in the
list of cf and check if it exists. I am not sure if this will work or if it
is a good approach. Any feedback is much appreciated.
Best regards,
Yamini Joshi
of
parameters.
I am back to square one. But I guess if there is no other option, I will
try to benchmark and keep you guys in the loop :)
Best regards,
Yamini Joshi
On Thu, Oct 20, 2016 at 4:22 PM, Josh Elser <josh.el...@gmail.com> wrote:
> I would like to inject some hesitation here. This i
Alright! Do you happen to have some reference code that I can refer to? I
am a newbie and I am not sure if by caching, aggregating and merge sort you
mean using some Accumulo wrapper or writing plain Java code.
Best regards,
Yamini Joshi
On Thu, Oct 20, 2016 at 2:49 PM, ivan bella <
might need to
generate new keys with columnfamily name as the key and count as the value.
Best regards,
Yamini Joshi
not return records in a sorted manner hence step 4
does not give me the required results :\
I am not sure how to proceed now.
Best regards,
Yamini Joshi
On Mon, Sep 26, 2016 at 8:28 AM, Josh Elser <josh.el...@gmail.com> wrote:
> I think I can understand what your query is doing, but, I'm just
Hello all
My keys are of the form rowID:otherID, where there are multiple otherIDs per
rowID. I want to know the count of all the otherIDs within a rowID. What
would be the optimal way to implement this?
Best regards,
Yamini Joshi
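The counting step itself can be sketched in plain Java. Inside Accumulo this aggregation would normally live in an iterator or Combiner; here it is just a TreeMap over flat rowID:otherID strings so the idea runs anywhere (the key format is taken from the question above, everything else is illustrative).

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Count the otherIDs seen under each rowID, given keys of the form
// "rowID:otherID".
public class RowCountSketch {
    public static SortedMap<String, Integer> countPerRow(List<String> keys) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            String rowId = key.substring(0, key.indexOf(':'));
            counts.merge(rowId, 1, Integer::sum); // increment, starting at 1
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("r1:a", "r1:b", "r2:a", "r1:c");
        System.out.println(countPerRow(keys)); // {r1=3, r2=1}
    }
}
```

Doing this client-side means shipping every key to the client; pushing the same fold into a server-side iterator is what the rest of the thread discusses.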
In other words, what helps in load balancing? HDFS replication or Data
center replication?
Best regards,
Yamini Joshi
On Sat, Oct 15, 2016 at 10:44 PM, Yamini Joshi <yamini.1...@gmail.com>
wrote:
> So HDFS is for durability while replication is for availability? I'm
> assuming tha
So HDFS is for durability while replication is for availability? I'm
assuming that the client is unaware of the replicated instance and queries
the DB with no knowledge of which instance/table will return the result.
Best regards,
Yamini Joshi
On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser
So, can I say that if I have a table split across nodes (i.e. num tablets >
1) and HDFS replication in my system, it is sort of equivalent to a sharded
and replicated mongo architecture?
Best regards,
Yamini Joshi
On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.el...@gmail.com&
this replication conf and the
replication on HDFS level. What exactly is the use case for replication?
Are the replicated instances visible to the clients?
Best regards,
Yamini Joshi
Alright. I'll keep that in mind. The next step for me will be to import
data from 90G Bson files. I think that'll be a good start for bulk import.
Best regards,
Yamini Joshi
On Tue, Oct 11, 2016 at 10:14 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Even 10G is a rather small amoun
,
Yamini Joshi
Thanks everyone for the help. It is working now. I had to edit some memory
confs and do a clean install. Also, the /tracers znode is created after
Accumulo is started (i.e. start-all.sh), not at init.
Best regards,
Yamini Joshi
On Fri, Oct 7, 2016 at 12:12 PM, Josh Elser <josh.el...@gmail.
,
Yamini Joshi
On Mon, Oct 10, 2016 at 5:09 AM, vaibhav thapliyal <
vaibhav.thapliyal...@gmail.com> wrote:
> Creating an Inverted Index could serve your use case. You can store the
> column family and column qualifier both in the row of the index table
> separated by a delimiter.
I can see that in my local setup on my laptop, but I can't see it here
somehow. I don't know what exactly is wrong.
Best regards,
Yamini Joshi
On Fri, Oct 7, 2016 at 11:00 AM, Josh Elser <josh.el...@gmail.com> wrote:
> It should be generated at /tracers when the Accumulo Tracer i
I don't understand why the tracer node is not generated at all.
Best regards,
Yamini Joshi
On Fri, Oct 7, 2016 at 10:19 AM, Yamini Joshi <yamini.1...@gmail.com> wrote:
> So the znode structure inside ZooKeeper (now, after formatting ZooKeeper) is:
> Accumulo
>
>- d61d7
- fate
- tservers
- tables
- replication
- next_file
- config
- bulk_failed_copyq
- dead
- masters
- instances
- test
test is the name of my new instance. Yes, I reinitialized Accumulo using
/bin/accumulo init
Best regards,
Yamini Joshi
On Fri
)
at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:80)
at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
... 6 more
Best regards,
Yamini Joshi
On Fri, Oct 7, 2016 at 10:08 AM, Sean Busbey <bus...@cloudera.com> wrote:
> tracers used to be under the in
1.7.2
Best regards,
Yamini Joshi
On Thu, Oct 6, 2016 at 4:17 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Hrm, maybe I am looking at a newer version of Accumulo than what you're
> using. What version are you on?
>
> Yamini Joshi wrote:
>
>> Thank you for rep
this up? I am attaching my config files here (all the rest are as
generated by the bin_config file).
Best regards,
Yamini Joshi
accumulo-env.sh
Description: Bourne shell script
instance.volumes = hdfs://m4:9000/accumulo  (comma-separated list of URIs for volumes)
Alright. Thanks :)
Best regards,
Yamini Joshi
On Fri, Sep 30, 2016 at 1:10 PM, Brian Loss <bfl...@praxiseng.com> wrote:
> That’s true for the row. For the other parts of the key, it can be done
> under the right circumstances.
>
> On Sep 30, 2016, at 2:05 PM, Yamini Joshi <
If I give it an empty range, it gives me the output of a simple scan (without
the iterator applied, even though the iterator is working). I guess it's bad
to modify keys within an iterator.
Best regards,
Yamini Joshi
On Fri, Sep 30, 2016 at 12:51 PM, Dan Blum <db...@bbn.com> wrote:
> Wha
regards,
Yamini Joshi
On Fri, Sep 30, 2016 at 12:31 PM, Dan Blum <db...@bbn.com> wrote:
> What code are you using to test the iterator, where you see no output?
>
>
>
> *From:* Yamini Joshi [mailto:yamini.1...@gmail.com]
> *Sent:* Friday, September 30, 2016 1:26 PM
> *To
range, columnFamilies, inclusive);
    next();
  }

  @Override
  public Key getTopKey() {
    return key;
  }

  @Override
  public Value getTopValue() {
    return value;
  }

  @Override
  public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
    return null;
  }
}
Best regards,
Yamini Joshi
looking for
an optimal solution (since filter might scan the entire database).
Best regards,
Yamini Joshi
Accumulo's built-in Combiner iterators
> <https://accumulo.apache.org/1.8/accumulo_user_manual#_combiners>. They
> seem more relevant than Filters.
>
> I don't know what you mean when you write that your output is not visible
> to "the complete Database".
>
> Regards