Hi Von,

Thank you for your suggestions. I have answered your queries on the doc.
After your replies, I will create a formal proposal as per Google guidelines
<https://google.github.io/gsocguides/student/writing-a-proposal> for review
as the submission period has now started.

Thanks,
Sohaib

On Mon, Mar 12, 2018 at 9:48 AM, Von Gosling <vongosl...@apache.org> wrote:

> Hi Sohaib,
>
> I have reviewed your doc and made some suggestions on the problems you
> raised.
>
> To all other GSoC students: could you follow the same practice as Sohaib?
> I am looking forward to your proposals on Google Docs.
>
> Best Regards,
> Von Gosling
>
> > On Mar 9, 2018, at 15:17, Sohaib Iftikhar <sohaib1...@gmail.com> wrote:
> >
> > Hi Yukon,
> >
> > What do you suggest for the key store itself? Do you propose writing this
> > ourselves or using some existing solution and writing a layer on top?
> >
> > Thanks,
> > Sohaib
> >
> > On Fri, Mar 9, 2018 at 6:20 AM, yukon <yu...@apache.org> wrote:
> >
> >> ```
> >> Personally, I find RAFT to be much simpler to implement. However, I do
> >> not expect to reinvent the wheel here.
> >> ```
> >>
> >> That's absolutely right. There is no need to reinvent the wheel; there
> >> are many existing implementations of Raft: https://raft.github.io/
> >>
> >> ```
> >> I don't think using key store to persist all the messages is a good
> >> idea.
> >> ```
> >>
> >> Yes, storing an ID is enough.
> >>
> >>
> >> On Thu, Mar 8, 2018 at 3:32 PM, Sohaib Iftikhar <sohaib1...@gmail.com>
> >> wrote:
> >>
> >>> Hi Dexin,
> >>>
> >>> Thank you for your suggestions. I will try to answer as much as I can
> >>> and leave the rest to the RocketMQ team.
> >>>
> >>> 1. The idea of incremental IDs is actually quite good. But @Yukon
> >>> mentioned that duplication can also be controlled by an application
> >>> (a special KV property), in which case different producers may produce
> >>> the same message, which then needs to be deduplicated on the broker.
> >>> I believe SessionId+IncrementalId won't work in this scenario. But we
> >>> can switch to the more efficient storage you described when the user
> >>> is not specifying these special keys.
> >>> Also, I proposed storing keys only for a fixed time interval. For all
> >>> practical purposes lookups would still remain effectively constant
> >>> time. [Log base 2 of 10^10 is still just about 33 :) ]. It does add the
> >>> extra cost of communication, but this would be the case in both
> >>> scenarios.
> >>> 2. As for consensus, the ideas I presented were pretty abstract, so I
> >>> mentioned a couple of algorithms that could potentially be used.
> >>> Personally, I find RAFT to be much simpler to implement. However, I do
> >>> not expect to reinvent the wheel here. I strongly believe that in this
> >>> case we can build upon some tested existing solution.
> >>>
> >>>
> >>> Regards,
> >>> Sohaib
> >>>
> >>> On Thu, Mar 8, 2018 at 1:31 AM, 李 德鑫 <dexi...@outlook.com> wrote:
> >>>
> >>>> Hi Sohaib,
> >>>>
> >>>>
> >>>> I'm a student applying for GSoC too, and I've read all of your
> >>>> discussion on the mailing list.
> >>>>
> >>>> I have some questions about your design, and some of them may need to
> >>>> be answered by the RocketMQ team, so I am sending them here for
> >>>> discussion.
> >>>>
> >>>> I don't think using key store to persist all the messages is a good
> >>>> idea. An MQ is built on O(1) data structures, so the key store would
> >>>> harm performance.
> >>>>
> >>>> I think we can learn from the TCP protocol.
> >>>>
> >>>> In producer-broker communication, we can assign an incremental ID to
> >>>> every message sent in the same session, and the session ID should be
> >>>> persisted on disk by the producer. The broker then only needs to
> >>>> maintain a map from session ID to the expected message ID (this is how
> >>>> Kafka does it), since there are far more messages than producers.
> >>>> However, a K/V store is still needed, so we have to ask the RocketMQ
> >>>> team how many producers are active at the same time in practice.
> >>>>
> >>>> The same idea also applies to consumer-broker communication.
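
The session-ID scheme described above can be sketched roughly as follows.
This is a minimal illustration under assumptions, not RocketMQ or Kafka
code: the class name SessionSequenceDeduplicator and its methods are
hypothetical, the map lives in memory rather than in a K/V store, and each
session is assumed to number its messages 0, 1, 2, ... in order.

```java
import java.util.HashMap;
import java.util.Map;

/** Broker-side duplicate check keyed by producer session (hypothetical class). */
class SessionSequenceDeduplicator {
    // sessionId -> next sequence number the broker expects from that session
    private final Map<String, Long> expectedSeq = new HashMap<>();

    /**
     * Returns true if the message should be stored, false if it is a duplicate.
     * Assumes each session sends sequence numbers 0, 1, 2, ... in order.
     */
    synchronized boolean accept(String sessionId, long seq) {
        long expected = expectedSeq.getOrDefault(sessionId, 0L);
        if (seq < expected) {
            return false; // already stored, e.g. a retry after a lost acknowledgement
        }
        // Sketch only: a gap (seq > expected) is simply accepted here; a real
        // broker would have to decide whether to reject or buffer such sends.
        expectedSeq.put(sessionId, seq + 1);
        return true;
    }

    /** Number of sessions tracked, i.e. the size of the state the broker keeps. */
    synchronized int trackedSessions() {
        return expectedSeq.size();
    }
}
```

The state grows with the number of producer sessions rather than with the
number of messages, which is the point of the proposal.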
> >>>>
> >>>>
> >>>> As for the consensus algorithm, I think RocketMQ should already have
> >>>> an implementation. I don't know what it is, but maybe you can reuse
> >>>> it. Or, if you do have to implement one, in my opinion there's no need
> >>>> to implement both Paxos and Raft, since they solve the same kind of
> >>>> problem.
> >>>>
> >>>>
> >>>>
> >>>> Regards,
> >>>>
> >>>> Dexin
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Sohaib Iftikhar <sohaib1...@gmail.com>
> >>>> Sent: March 7, 2018 18:15:51
> >>>> To: dev@rocketmq.apache.org
> >>>> Subject: Re: [GSOC|ROCKETMQ-124] Support non-redundant message delivery
> >>>> mechanism
> >>>>
> >>>> Hi Yukon,
> >>>>
> >>>> Thanks for your reply. Yes, it would be nice to concretely define the
> >>>> scope of this project, as the doc is a bit ambitious for just a
> >>>> summer. Should you (or anyone else) have
> >>>> questions/suggestions/clarifications, I'd be glad to discuss more
> >>>> details.
> >>>>
> >>>> Thanks,
> >>>> Sohaib
> >>>>
> >>>> On Wed, Mar 7, 2018 at 8:58 AM, yukon <yu...@apache.org> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> Google Docs is better for discussion. Your design is great; now we
> >>>>> can discuss more details based on it.
> >>>>>
> >>>>> Any advice is welcome from RocketMQ community.
> >>>>>
> >>>>> Appreciate your efforts.
> >>>>>
> >>>>> Regards,
> >>>>> yukon
> >>>>>
> >>>>> On Wed, Mar 7, 2018 at 5:15 AM, Sohaib Iftikhar <sohaib1...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> @Yukon Thank you for your reply. This clears some doubts.
> >>>>>>
> >>>>>> Sorry for the delay, as I was somewhat occupied with another project.
> >>>>>> I have created an initial design doc. Since email is a bit cumbersome
> >>>>>> for feedback, I wrote this document in two formats:
> >>>>>>
> >>>>>> 1. In the form of a Google document:
> >>>>>> https://docs.google.com/document/d/1KSpXGNDH0HF5E27lfKJxJnjIjPtlP1Q-M6rj3yZde24
> >>>>>> The document is open for comments to all users without signing in. I
> >>>>>> would appreciate it if you put your name before the comment so I can
> >>>>>> identify who to follow up the discussion with.
> >>>>>>
> >>>>>> 2. As a markdown file on GitHub:
> >>>>>> https://github.com/sohaibiftikhar/rocketmq/blob/gsoc_design/gsoc_design.md
> >>>>>> Comments on this can be made on the commit:
> >>>>>> https://github.com/sohaibiftikhar/rocketmq/commit/dfd55fc69f430fc024217a3b20dde31717334e62
> >>>>>>
> >>>>>> After I have received a certain amount of feedback, I will try to
> >>>>>> incorporate it and put up a subsequent version for review. Please tell
> >>>>>> me which method suits you better (gdoc or GitHub) for review, and we
> >>>>>> can continue with that for the subsequent versions.
> >>>>>>
> >>>>>> Lastly, the document is a couple of pages long, so I appreciate your
> >>>>>> patience and your help.
> >>>>>> Looking forward to your opinions.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Sohaib
> >>>>>>
> >>>>>> On Mon, Mar 5, 2018 at 1:01 PM, yukon <yu...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi Sohaib,
> >>>>>>>
> >>>>>>> Sorry for the late reply. We can move this project forward now ~
> >>>>>>>
> >>>>>>> ```
> >>>>>>> I would at some point like to post
> >>>>>>> design ideas to this problem privately to get it reviewed by the
> >>>>>>> development community but not make it publicly available so that it
> >>>>>>> cannot be plagiarised.
> >>>>>>> ```
> >>>>>>>
> >>>>>>> You can send your design ideas to me directly or to our PMC list
> >>>>>>> (priv...@rocketmq.apache.org) if you want to keep your ideas private.
> >>>>>>> But please don't break away from the community.
> >>>>>>>
> >>>>>>> I hope you have already understood the goal of this project. RocketMQ
> >>>>>>> currently supports at-least-once delivery, so an obvious way to
> >>>>>>> achieve exactly-once is to remove duplicated messages.
> >>>>>>>
> >>>>>>> Return to your original questions:
> >>>>>>>
> >>>>>>> 1. What defines a redundant message?
> >>>>>>>
> >>>>>>> A message ID is generated when a new message is created, so this ID
> >>>>>>> can be used to identify a message. The user can also specify a unique
> >>>>>>> business-related property to identify a message.
> >>>>>>>
> >>>>>>> Redundant messages occur when the network is broken or reconnected,
> >>>>>>> when a rebalance[1] is triggered, etc.
> >>>>>>>
> >>>>>>>
> >>>>>>> 2. Is there a timeline on the redundant messages?
> >>>>>>>
> >>>>>>> Yes. Keeping all messages non-redundant is expensive, so let's
> >>>>>>> consider this question within a certain time window ~
> >>>>>>>
> >>>>>>> Looking forward to your design.
> >>>>>>>
> >>>>>>> [1].
> >>>>>>> https://github.com/apache/rocketmq/blob/master/client/src/main/java/org/apache/rocketmq/client/impl/consumer/RebalanceService.java
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> yukon
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Mar 2, 2018 at 9:31 PM, Sohaib Iftikhar <sohaib1...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> @Zhanhui Thanks for the response. This is not a campaign; it's just
> >>>>>>>> part of GSoC (https://summerofcode.withgoogle.com/). Community help
> >>>>>>>> is gladly welcomed. In fact, it is recommended :)
> >>>>>>>>
> >>>>>>>> @KaiYuan Thanks for your suggestions. I will come up with a flow
> >>>>>>>> chart for the proposed solution this weekend.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Sohaib
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Mar 2, 2018 at 3:41 AM, Zhanhui Li <lizhan...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Sohaib,
> >>>>>>>>>
> >>>>>>>>> I have been rather busy these days. Sorry for replying so late!
> >>>>>>>>>
> >>>>>>>>> I am not sure what "deadline" you are referring to. If this is part
> >>>>>>>>> of a campaign, I have to admit I am not aware of the regulations or
> >>>>>>>>> of what kind of help I should offer while maintaining fairness,
> >>>>>>>>> given that other similar issues may arise.
> >>>>>>>>>
> >>>>>>>>> Regards!
> >>>>>>>>>
> >>>>>>>>> Zhanhui Li
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Mar 1, 2018, at 3:43 AM, Sohaib Iftikhar <sohaib1...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi guys,
> >>>>>>>>>>
> >>>>>>>>>> It would be nice to have some feedback on this, as the deadline is
> >>>>>>>>>> not too far off :)
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Sohaib
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>
> >>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Feb 26, 2018 at 10:36 AM, Sohaib Iftikhar <sohaib1...@gmail.com> wrote:
> >>>>>>>>>> Thank you for the pointers to the code. This was super helpful. The
> >>>>>>>>>> multiple keys could probably be serialized better than by
> >>>>>>>>>> separating them with a space, but that is already legacy, I suppose.
> >>>>>>>>>>
> >>>>>>>>>> Firstly, filters like Bloom or cuckoo filters are probabilistic
> >>>>>>>>>> (they can return false positives). They can help make things faster
> >>>>>>>>>> but definitely cannot be used as the only solution. Hence, in the
> >>>>>>>>>> end, we will still need a persistent keystore/distributed set. My
> >>>>>>>>>> plan was to make this keystore distributed (Raft guarantees etc.).
> >>>>>>>>>> The keystore can also hold a persistent filter on its end. If a
> >>>>>>>>>> broker crashes, it can renew/refresh its filter from the keystore,
> >>>>>>>>>> which eliminates the crash-related problems you mention. The
> >>>>>>>>>> difficulty here could be maintaining filter performance when keys
> >>>>>>>>>> are removed from the keystore (e.g. sliding windows as mentioned in
> >>>>>>>>>> my previous mail). Periodically refreshing the filters can help
> >>>>>>>>>> solve this, but I am open to suggestions on how to make this better.
> >>>>>>>>>>
> >>>>>>>>>> I think implementing a distributed set on the client cluster has
> >>>>>>>>>> its caveats. The way I understand RocketMQ, we do not have control
> >>>>>>>>>> over the disk space/memory on the client end, so we probably only
> >>>>>>>>>> have a constant amount. A distributed set on the client would also
> >>>>>>>>>> need to be persistent, e.g. for when a client restarts or recovers.
> >>>>>>>>>> This basically means we need a keystore on the client instead of on
> >>>>>>>>>> the broker cluster, which probably puts too much responsibility on
> >>>>>>>>>> the client cluster. A different approach would be to ensure that
> >>>>>>>>>> the offsets are always in sync with the broker. Since the broker
> >>>>>>>>>> only serves unique messages (based on the proposed solution on the
> >>>>>>>>>> producer/broker end), all we need to ensure is that a client does
> >>>>>>>>>> not consume messages with the same offset twice.
> >>>>>>>>>>
> >>>>>>>>>> Please suggest improvements if this does not look like the correct
> >>>>>>>>>> approach. It would also be great if someone could come up with a
> >>>>>>>>>> completely different approach so that we can weigh up the pros and
> >>>>>>>>>> cons.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for reading this through; I look forward to your opinions.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>
> >>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Feb 26, 2018 at 3:58 AM, Zhanhui Li <lizhan...@gmail.com> wrote:
> >>>>>>>>>> Hi Sohaib,
> >>>>>>>>>>
> >>>>>>>>>> Regarding multiple key support, the following should clarify your
> >>>>>>>>>> doubt:
> >>>>>>>>>> The org.apache.rocketmq.common.message.Message class has overloaded
> >>>>>>>>>> setKeys methods, allowing you to set multiple keys via a string
> >>>>>>>>>> (separated by spaces; sorry, we have not yet unified all separators,
> >>>>>>>>>> hopefully this does not confuse you) or a collection.
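
A small sketch of what space-separated keys imply. This is illustrative
only and does not call RocketMQ: the class SpaceSeparatedKeys and its
methods are hypothetical stand-ins that mimic joining keys into one
property string and splitting that string again for indexing.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mimics a space-separated key property, not RocketMQ code. */
class SpaceSeparatedKeys {
    // The separator named in the message above.
    static final String SEPARATOR = " ";

    /** Joins several keys into a single property string. */
    static String join(List<String> keys) {
        return String.join(SEPARATOR, keys);
    }

    /** Splits the property string back into the keys that would be indexed. */
    static List<String> split(String property) {
        return Arrays.asList(property.split(SEPARATOR));
    }
}
```

With this scheme, a single key that itself contains a space (for example
"order 1") comes back from split as two keys, which is the separator
collision raised earlier in the thread.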
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> When the broker builds the index for a message with multiple keys,
> >>>>>>>>>> multiple index entries are inserted into the index file.
> >>>>>>>>>> See org.apache.rocketmq.store.index.IndexService#buildIndex
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> In terms of eliminating message duplication, personally, I wish we
> >>>>>>>>>> had exactly-once semantics covering the whole cluster and the
> >>>>>>>>>> complete send-store-consume process. A rough idea is to route each
> >>>>>>>>>> message to a broker according to a rule based on its unique key;
> >>>>>>>>>> the serving broker then ensures uniqueness of the message according
> >>>>>>>>>> to the key (as you said, Bloom filter, cuckoo filter, etc.). Things
> >>>>>>>>>> might look simple, but issues arise in scenarios where the cluster
> >>>>>>>>>> is experiencing membership changes. For example, what if a broker
> >>>>>>>>>> crashes? We might need to propagate the Bloom filter bitset
> >>>>>>>>>> synchronously to other brokers serving the same topics. What if a
> >>>>>>>>>> new broker joins the cluster and starts to serve? I do not mean
> >>>>>>>>>> this is too complex to implement. Rather, this is a pretty
> >>>>>>>>>> interesting topic and a fancy feature to have. Alternatively, we
> >>>>>>>>>> might defer eliminating duplicates to the consumption phase using
> >>>>>>>>>> some kind of distributed set. For sure, the idea I am proposing
> >>>>>>>>>> suffers from the same challenges, including membership changes.
> >>>>>>>>>>
> >>>>>>>>>> Everyone on the dev board, any insights on this issue?
> >>>>>>>>>>
> >>>>>>>>>> Zhanhui Li
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Feb 26, 2018, at 2:47 AM, Sohaib Iftikhar <sohaib1...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Zhanhui,
> >>>>>>>>>>>
> >>>>>>>>>>> I have a doubt about these multiple keys. If I am wrong in any of
> >>>>>>>>>>> the assumptions I make, please point it out.
> >>>>>>>>>>>
> >>>>>>>>>>> If there is support for multiple keys, I cannot see it in the code.
> >>>>>>>>>>> The class Message only stores a single key in the property map
> >>>>>>>>>>> against the property name "KEYS". Is this done in the same way as
> >>>>>>>>>>> for tags, that is, different keys are separated with ' || '? So
> >>>>>>>>>>> basically, as a user of the producer API, it is the user's
> >>>>>>>>>>> responsibility to ensure that the different keys are separated with
> >>>>>>>>>>> the correct separator. I can see an obvious problem here: what if a
> >>>>>>>>>>> key contains the special sequence ' || '? But maybe this event is
> >>>>>>>>>>> rare and hence not important. Could you point me to some source/doc
> >>>>>>>>>>> that explains this part? I was looking at the index section of
> >>>>>>>>>>> rocketmq-store, but I have not been able to understand the indexing
> >>>>>>>>>>> process completely for now. I will keep reading the source to get a
> >>>>>>>>>>> better idea.
> >>>>>>>>>>>
> >>>>>>>>>>> Moving on to the implementation details, here is a broad idea of one
> >>>>>>>>>>> possible way to approach it.
> >>>>>>>>>>>
> >>>>>>>>>>> The goal is to remove duplicate messages. In this issue, I would
> >>>>>>>>>>> like to aim at eliminating duplicate messages at the
> >>>>>>>>>>> producer/broker end. For now, we do not concern ourselves with
> >>>>>>>>>>> duplicate messages caused by unwritten consumer offsets, as these
> >>>>>>>>>>> two issues have different solutions. One way to solve this problem
> >>>>>>>>>>> at the producer/broker end could be to have a distributed key store
> >>>>>>>>>>> that stores the messages. We can make it configurable such that
> >>>>>>>>>>> this distributed store either stores all messages or works as a
> >>>>>>>>>>> sliding window, keeping only the messages from the last X seconds
> >>>>>>>>>>> specified by the user. We can have a layer on top to check set
> >>>>>>>>>>> membership, such as a Bloom filter or a cuckoo filter
> >>>>>>>>>>> (https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf), to help
> >>>>>>>>>>> performance. Every message pushed in by a producer is checked first
> >>>>>>>>>>> against the filter and, in case of a positive result, against this
> >>>>>>>>>>> key store. If the message is found, it is discarded. This helps
> >>>>>>>>>>> remove duplicates completely from a producer perspective. The core
> >>>>>>>>>>> of this idea is the distributed key store, which would be
> >>>>>>>>>>> completely separate from the current message storage. Since the
> >>>>>>>>>>> concept of a distributed key store or key/value store is not novel,
> >>>>>>>>>>> there are two ways to do this:
> >>>>>>>>>>> 1. Implement it ourselves. This would be high effort but would add
> >>>>>>>>>>> no external dependencies.
> >>>>>>>>>>> 2. Use a key-value store such as Redis (which already has timeouts
> >>>>>>>>>>> and persistence but a large memory footprint) or some other
> >>>>>>>>>>> disk-based storage for set membership. This would add an external
> >>>>>>>>>>> dependency, but development time would reduce significantly for
> >>>>>>>>>>> such a solution.
> >>>>>>>>>>> I am inclined towards implementing it myself, as this would avoid
> >>>>>>>>>>> dependencies on other products, especially since RocketMQ is
> >>>>>>>>>>> currently a self-reliant system. In addition, my past experience
> >>>>>>>>>>> with building such a store should come in handy.
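
The filter-then-store check described in this message could look roughly
like the sketch below. Everything here is an assumption made for
illustration: the class WindowedDeduplicator is hypothetical, an in-memory
HashMap stands in for the distributed key store, the two hash functions are
deliberately simple, and the filter is rebuilt whenever keys expire because
a plain Bloom filter cannot delete entries.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Sketch: Bloom filter in front of a time-windowed key store (names hypothetical). */
class WindowedDeduplicator {
    private final BitSet bits;
    private final int size;
    private final long windowMillis;
    // Stand-in for the distributed key store: key -> time it was first seen.
    private final Map<String, Long> store = new HashMap<>();

    WindowedDeduplicator(int filterBits, long windowMillis) {
        this.bits = new BitSet(filterBits);
        this.size = filterBits;
        this.windowMillis = windowMillis;
    }

    // Two cheap bit positions derived from String.hashCode (illustrative, not tuned).
    private int h1(String k) { return Math.floorMod(k.hashCode(), size); }
    private int h2(String k) { return Math.floorMod(k.hashCode() * 31 + 17, size); }

    /** Returns true if the key is new within the window, false if it is a duplicate. */
    synchronized boolean offer(String key, long nowMillis) {
        expire(nowMillis);
        boolean maybeSeen = bits.get(h1(key)) && bits.get(h2(key));
        // A negative filter answer is definitive; a positive one must be
        // confirmed against the key store because of false positives.
        if (maybeSeen && store.containsKey(key)) {
            return false;
        }
        bits.set(h1(key));
        bits.set(h2(key));
        store.put(key, nowMillis);
        return true;
    }

    // Drops keys older than the window, then rebuilds the filter from what is left.
    private void expire(long nowMillis) {
        boolean removed = store.values().removeIf(t -> nowMillis - t >= windowMillis);
        if (removed) {
            bits.clear();
            for (String k : store.keySet()) {
                bits.set(h1(k));
                bits.set(h2(k));
            }
        }
    }
}
```

The key store is only consulted when the filter reports a possible match,
so in this sketch most new keys never reach it.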
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to know the opinions of the development community on
> >>>>>>>>>>> this approach and to hear suggested improvements. Looking forward
> >>>>>>>>>>> to your responses.
> >>>>>>>>>>>
> >>>>>>>>>>> ====<question unrelated to issue>=====
> >>>>>>>>>>> To increase my familiarity with the code base and to help show that
> >>>>>>>>>>> I am familiar with the tools and technologies in place, it would be
> >>>>>>>>>>> great if I could be pointed to some low-effort issues that I could
> >>>>>>>>>>> help out with. In case there are no 'newbie' issues available, I
> >>>>>>>>>>> could help improve the comments inside the codebase. I noticed some
> >>>>>>>>>>> source files with no explanations, which could be documented via
> >>>>>>>>>>> comments to help onboard new contributors faster.
> >>>>>>>>>>> ====</question unrelated to issue>=====
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks a lot for reading this through; I look forward to your
> >>>>>>>>>>> opinions.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Sohaib
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Sat, Feb 24, 2018 at 11:50 AM, Zhanhui Li <lizhan...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Sohaib,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Happy to know you are interested in RocketMQ.
> >>>>>>>>>>>>
> >>>>>>>>>>>> First, let me answer questions you raised.
> >>>>>>>>>>>>
> >>>>>>>>>>>> — can there be multiple tags?
> >>>>>>>>>>>> No. At present, the storage engine allows a single tag only.
> >>>>>>>>>>>> Subscriptions are allowed to use a combination of tags. The
> >>>>>>>>>>>> current model should meet your business needs. If not, please let
> >>>>>>>>>>>> us know.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> — key (Similar question to above.)
> >>>>>>>>>>>> RocketMQ builds its index using message keys. A single message may
> >>>>>>>>>>>> have multiple keys.
> >>>>>>>>>>>>
> >>>>>>>>>>>> — About redundant message
> >>>>>>>>>>>> From my understanding, you are trying to eliminate duplicate
> >>>>>>>>>>>> messages. It is true that there are various causes of message
> >>>>>>>>>>>> duplication, ranging from message delivery to consumption.
> >>>>>>>>>>>> Discussion on this topic is warmly welcome. If you have any ideas
> >>>>>>>>>>>> to contribute on this issue, the developer board is happy to
> >>>>>>>>>>>> discuss them.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Zhanhui Li
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Feb 24, 2018, at 11:17 AM, Sohaib Iftikhar <sohaib1...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My earlier email message seems to have gotten lost, so I will try
> >>>>>>>>>>>>> again. Please see the original message for the discussion.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 20, 2018 at 1:54 AM, Sohaib Iftikhar <sohaib1...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I am interested in working on this issue
> >>>>>>>>>>>>>> (https://issues.apache.org/jira/browse/ROCKETMQ-124) as part of
> >>>>>>>>>>>>>> GSoC 2018. I have a few questions about it. I am not sure if this
> >>>>>>>>>>>>>> discussion needs to be on the JIRA issue or here. Feel free to
> >>>>>>>>>>>>>> correct me if this is the wrong platform. Also, while I have
> >>>>>>>>>>>>>> worked with distributed pub-sub systems, I am still fairly new to
> >>>>>>>>>>>>>> RocketMQ, so maybe my understanding of it is incorrect. I
> >>>>>>>>>>>>>> apologise if that is the case and would be happy to stand
> >>>>>>>>>>>>>> corrected.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Following are my questions:
> >>>>>>>>>>>>>> 1. What defines a redundant message?
> >>>>>>>>>>>>>>  The constructor that I see for a message is as follows:
> >>>>>>>>>>>>>>  Message(String topic, String tags, String keys, int flag,
> >>>>>>>>>>>>>>  byte[] body, boolean waitStoreMsgOK)
> >>>>>>>>>>>>>>  Possible candidates to me are topic, tags (can there be multiple
> >>>>>>>>>>>>>>  tags? I could not find an example for this. If yes, how are they
> >>>>>>>>>>>>>>  separated?), keys (similar question to above) and of course the
> >>>>>>>>>>>>>>  body. Is there something that I have missed in this? Is there
> >>>>>>>>>>>>>>  something that we do not need to consider?
> >>>>>>>>>>>>>> 2. Is there a timeline on the redundant messages? What I mean by
> >>>>>>>>>>>>>> this is: is there a time limit after which a message with similar
> >>>>>>>>>>>>>> content is allowed? From what I gather, no such thing was
> >>>>>>>>>>>>>> mentioned, which would mean storing all the messages. Depending
> >>>>>>>>>>>>>> on the requirements, this may or may not be the best solution. It
> >>>>>>>>>>>>>> might be desirable that no duplicates occur within a certain
> >>>>>>>>>>>>>> (sliding) time window. This allows ignoring duplicate messages
> >>>>>>>>>>>>>> that were generated very close to each other (or within the
> >>>>>>>>>>>>>> window indicated). Depending on this requirement, the
> >>>>>>>>>>>>>> implementation may become a little more involved.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For now, these are the only questions. I have ideas about possible
> >>>>>>>>>>>>>> implementations that need review, but I will mention them once the
> >>>>>>>>>>>>>> specifications are clear to me. As an end question, I would at
> >>>>>>>>>>>>>> some point like to post design ideas to this problem privately to
> >>>>>>>>>>>>>> get it reviewed by the development community but not make it
> >>>>>>>>>>>>>> publicly available so that it cannot be plagiarised. What
> >>>>>>>>>>>>>> platform/method can I use to do that? Or is submitting a draft to
> >>>>>>>>>>>>>> the Google platform the only possible way to accomplish this?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks a lot for reading this through; I look forward to your
> >>>>>>>>>>>>>> inputs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>