mcvsubbu commented on issue #7929:
URL: https://github.com/apache/pinot/issues/7929#issuecomment-997435881
> > If you want to allocate memory in the realtime context, please use the
class RealtimeIndexOffheapMemoryManager . It will make sure to allocate memory
as per the configuration set by the admin -- memory mapped, or direct, etc. It
is also accounted for, and so can be used for provisioning purpose.
>
> Before my change `ByteBuffer.allocateDirect` was used, but I'll make a
note of this.
I meant if you wanted to change anything in the realtime mem allocation
path. Code has evolved, and more and more special casing has been added since
the time the memory manager was introduced. Not all changes have followed the
approach. Perhaps because it did not matter either way. As we moved forward to
make changes, we should keep that in mind. Now that you have done so, I am
fine, thanks.
>
> > Yes, we should not modify general logic for outlier use cases. In this
case, we had a production issue, and had to revert the deployment, and spend
multiple days trying to reproduce the problem, narrowing down the commit and
then identifying a problem within that commit.
>
> I apologise for the inconvenience caused by the change, but it's anything
but an outlier use case. New user experience has to be considered - if a new
user stores text (web scraping, XML, JSON, typically no control over max
length) in a raw index, they will find it requires a lot of memory. How do we
prevent that new user from having a bad experience at the same time as avoiding
finding out about inadvertent regressions in your production environment? Just
not making changes would be a good way to avoid regressions in your production
environment, but it leaves new users whose workloads are slightly different to
yours out in the cold. To avoid a repeat incident, firstly, we need better
performance tests so that unconsidered codepaths can't regress without anyone
knowing. Secondly, I repeat the suggestion to decouple fixed width and variable
width storage, because the implementation has a downside either way so long as
it's shared.
Perhaps we differ a bit here. A new user of today is just starting off. We
must definitely be cognizant of revenue-impacting production use cases already
running on the system. Today's new user will be tomorrow's production user, and
will not be happy if a new release breaks something in production :-) So, we
should be careful if we consider new use cases to be the most common ones.
That being said, there should be a balance. I am definitely not suggesting
not making any changes, and I believe you know that :-). And some of these
(especially performance bugs) are hard to catch before we go live. We do have
performance testing as well in our pipeline, but this was not caught. Exactly
one out of 150+ use cases happened to break :-) , so I would go as far to claim
that it is impossible to guarantee that no regressions happen. We all commit
code and do the best we can to be correct, and learn along the way as well.
If you can please work with @sajjad-moradi and arrive at a common solution
that will be appreciated, thanks.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]