Re: [DISCUSS] Time to thank Hadoop for it's service and remove it?

Gus Heck Mon, 21 Oct 2024 14:21:12 -0700

I was thinking about this recently too. I searched our archives and the only
interesting email regarding Hadoop was a response in which Dave Smiley
pointed out that the backend is pluggable and thus could be used to
target S3... but if we want to support an S3 storage backend, this should
probably be done more directly and with a clear notion of how to avoid
wasteful duplication across replicas when S3 already has its own
redundancy. (I did see some mention of replicas creating needless
redundancy on Hadoop.) There were, however, a whole passel of CVE-related
emails that referenced CVEs in Hadoop libs.


So I have seen little evidence that anyone uses this integration anymore.
However, this question should probably be posed to the user list as well.

If we can't drum up a response there, I'm definitely +1 to lightening the
load via options 1 or 4.

On Mon, Oct 21, 2024 at 5:05 PM David Eric Pugh <[email protected]>
wrote:

> I just re-read my copy of Marie Kondo's book The Life-Changing Magic of
> Tidying Up[1]  and it brought to mind the state of our Hadoop integrations
> with Solr.   I'd like to gauge the community's thoughts on how we move
> forward with Hadoop in Solr 10.
> My perspective is that Hadoop is no longer a key part of Solr's future,
> and that is reflected in its lack of maintenance and the tech debt that it
> appears to be carrying.  We seem to have a lot of points of discussion
> where we want to do something and the response is "but Hadoop doesn't
> support it" or "the tests for Hadoop fail".
>
> I believe everything we would be removing is in:
> * Hadoop Auth Module
> * HDFS Module
>
> If it's useful to the community I can make a longer argument about why we
> need to thank Hadoop for its service and say goodbye.
>
> Otherwise, I think these are our paths forward:
> 1) Just straight up remove both modules in Solr 10 like we did with
> analytics.
> 2) Move both modules to the solr-sandbox repository.  Can we just leave
> them there on Solr 9 and see if they get some new life?
> 3) Actively recruit someone to be a committer focused on the Hadoop code
> to bring them up to date and allow them to stay?  I would want to time
> box this effort.
> 4) If someone volunteers to maintain them, move those modules to
> independent GitHub repos like we did with DIH.
> Thoughts?  Other suggestions?
>
> Eric
>
>
>
> [1] https://en.wikipedia.org/wiki/Marie_Kondo
>
>

-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
