On Mar 19, 2013, at 11:40 AM, Erick Erickson <[email protected]> wrote:

> bq: In particular, there is some code that says, don't lock for these it 
> causes deadlock 
> 
> There's no place where I intentionally put in code that didn't lock shared 
> variables etc. to avoid deadlock and then crossed my fingers hoping all went 
> well. Which says nothing about whether there are inadvertent places of course.
> 
> There are some notes about why locks are obtained long enough to copy from 
> shared variables to local variables since locking the shared stuff while, 
> say, closing cores would (and did) lead to deadlock, e.g. 
> CoreContainer.clearMaps(). That's a case where the rest of the locking is 
> supposed to suffice.
> 
> And I remember one spot, although I can't find it right now, where I 
> intentionally did not obtain a lock but the code is, as I remember, 
> inaccessible except from code that _does_ have a lock.
> 
> So point me at the code you're uncomfortable with and I'll be happy to look 
> it over again, it's possible that my cryptic comments which were mostly to 
> myself are misleading you. Of course it's also quite possible that you're 
> totally right and I screwed up, wouldn't be the first time…..

I haven't had a chance to do a real review, so I'll wait to point out anything 
specifically - my main point is that I think the code could use some review and 
I think to this point it has not had any, at least that I've picked up. It's 
been on my list a long time and I still hope to get to it, but the fact that no 
one else has really looked yet, it makes me less comfortable rushing such a 
large change out. I agree that the stress test is comforting, but it still has 
some random fails - perhaps just test fails, but it all just calls out for some 
more eyes because it's a rather large, central change. This is a central part 
of Solr that everyone uses. 

It's not that I don't trust you, I just know how a big a change this is and I 
think it deserves a second pair of eyes at least in some spots. I'm mostly 
trying to frame the future when I talk about 5x - something like this seems 
like it should have been in 5x prominently for a while. At this point, it may 
be more work to go backward than forward on that front. I think about this 
because I've been into the idea of releasing fairly often on the 4.x branch - 
and Robert is a big releaser as well - so I'm going to be paying close 
attention to issues that make it a little harder to just release at any time.

> 
> I was kind of sad to see the pluggable core descriptor go, it seemed like 
> kind of a neat thing. But it didn't really have a compelling use-case over 
> auto-discovery so there's no good reason to bring it back. I suppose if we 
> bring it back (not suggesting it now, mind you) that we could use the 
> extracted manipulation in ConfigSolrXmlBackCompat (which will be renamed if 
> we pull the solr.properties file) as the template for an interface, but 
> that's for later.
> 
> But the properties way of doing things did seem awkward, so I'm not against 
> yanking it. Much of the other code is there (I'm thinking about all of the 
> pending core operations) to address shortcomings that have been there for a 
> while. We've been able to lazily load/unload cores since 4.1, I believe the 
> stress test running against 4.1 would _not_ be pretty so taking all that out 
> seems like a mistake.

If we can come to consensus on the next move, I'm happy to help dig into some 
of this. I'm still hopeful that it might be a somewhat minor change since it's 
really just altering the on disk format of the config file?

- Mark


> 
> 
> 
> 
> On Tue, Mar 19, 2013 at 8:02 AM, Mark Miller <[email protected]> wrote:
> 
> On Mar 19, 2013, at 9:44 AM, Erick Erickson <[email protected]> wrote:
> 
> > So are you talking about backing all this out of 4x or just taking the 
> > properties bits out? Because backing this all out of the 4x code line will 
> > be...er...challenging, much more challenging than just yanking the 
> > properties file bits out, this latter is much easier whereas reverting will 
> > be a pain, I'll need some help there. I'm not really willing to re-do all 
> > of the bits around preventing loading/unloading core operations occurring 
> > at the same time from the ground up, it was too painful to do the first 
> > time. But this isn't really a problem I don't think, since that code is 
> > only relevant if you specify the lazy core loading options in your cores.
> 
> I'm bringing up both things. The fact that neither our tests or example use 
> this new format yet scares me. It really feels like this didn't bake in 5x at 
> all if it wasn't part of the example or driving tests. The only baking it had 
> was it's own unit tests, but what it really needed was dev/user face time.
> 
> I'm open to driving this out in 4.3, but I'm pointing out where things are at 
> and that people should weigh in so we can solidify this and actually get the 
> changes into developers and users hands (beyond just the refactoring that has 
> gone into the old style back compat) or pull it from 4x until it gets some 
> real user time in 5x.
> 
> >
> > I don't quite get what you mean by solr.xml and solrconfig.xml being 
> > similar, just both being xml files?
> 
> I think they should both be xml files and with similar style, yes. Or 
> eventually both changed to another format, but in my mind they should be 
> consistent.
> 
> > And are we talking a discovery/no discovery attribute in, say, the <cores> 
> > tag?
> 
> Something like that - Robert tossed out some ideas, but we would have to work 
> through what is best.
> 
> > Nor do I quite get the bits around threading. The stress code does test the 
> > threading I think.
> 
> There are shared variables that are accessed without locks - at least there 
> were a couple days ago when I took a casual look. That's a no no. Might have 
> to do with some of the random stress tests fails we still have, but not get 
> tickled by the stress tests - it's a bug in either case, so I won't be happy 
> with it until that code has gotten some more review.
> 
> In particular, there is some code that says, don't lock for these it causes 
> deadlock - that's the biggest one I want to address - we can't sacrifice 
> safety in accessing shared vars to avoid deadlock - we have to solve both. 
> Now I have not looked in a few days, perhaps things have shifted, but I'd 
> like that code to get a solid review by someone else that knows CoreContainer 
> if possible.
> 
> > It also specifically does both properties and xml versions of solr.###, so 
> > it is part of our normal testing. Whether we can harden it enough for 4.3 
> > is certainly a legitimate question.
> 
> This is my worry. I know I've discussed a 4.3 with Robert within a relatively 
> short time. Meanwhile, this large change that we will be locked into with 
> back compat is still not part of the example or driving our tests. I'm 
> worried it should have been put in 5x first, and made part of the example at 
> least. I realize it may be too later for that now, which is why I'm 
> straddling this middle path of, I don't know if this is ready for 4.3 so 
> let's either pull it or try and make it ready. I don't think you did anything 
> wrong to get us here, I'm just stating what I feel given where we are.
> 
> The real key to making it ready in my mind is to change the example and force 
> more people into facing the new setup.
> 
> >
> > So are we reconsidering what I had a couple of months ago with pluggable 
> > core discovery? The default would be to do things as they're done now, we'd 
> > provide an implementation that walked the directory structure, and dropping 
> > in a different version that, say, queried ZK or a database would be pretty 
> > straight-forward.
> >
> > but if the solr.properties thing is to be killed, now is the time.
> 
> I think we are only talking about the format of the core container 
> configuration. I don't think it really changes much of what you have done or 
> wanted to do - it's simple the format of the prop file.
> 
> And getting more people to use that path so that we know it's right one and 
> shake out more problems/bugs!
> 
> I'm less concerned with the stability (although that concerns me since the 
> new "way" is mostly hidden right now and so not getting hit by devs or trunk 
> users unless they really go out of the way) and more concerned around being 
> locked into what we do here. So I'm hoping some other committers are forced 
> to look closer at this before we release it.
> 
> - Mark
> 
> >
> >
> >
> >
> > On Mon, Mar 18, 2013 at 8:42 PM, Mark Miller <[email protected]> wrote:
> > I've been thinking about this in the background. I tossed out some of the 
> > idea's that were implemented here, and I also thought a properties file 
> > rather than xml might make the most sense for container config. After all, 
> > 99% of the things you set will be simple key->value props that are 
> > attributes of the cores tag. I thought, wouldn't that just be simple as a 
> > properties file? Easy hand wavy thought at the time.
> >
> > I had forgotten that the shard handler configuration had crept into 
> > solr.xml. That made me think, what else will come creeping this way? Other 
> > solrconfig.xml stuff that doesn't belong per SolrCore. If my memory is 
> > right, shard handler was in solrconfig.xml at one point - so was the config 
> > for zookeeper in the very first integration attempt with Solr.
> >
> > Shouldn't the Container and SolrCore config mostly match? Wouldn't it be 
> > funny if was one yaml and the other xml? One a flat structure and the other 
> > nested? I think so.
> >
> > So I'm changing my mind on the properties files and favoring the use of the 
> > current solr.xml. It's a really easy switch for current users, it's 
> > consistent with what we have, and we can get 99% of the benefits of Erick's 
> > work even retaining solr.xml.
> >
> > I also think solr.xml and solrconfig.xml should be as similar as they can 
> > be.
> >
> > The real problem with solr.xml is still solved with Erick's work - the fact 
> > that cores are define there and Solr has to try and update the config file. 
> > The file is really not so bad once we remove that little ugly wart.
> >
> > I think there would be a few details to work out, but this path is very 
> > appealing to me.
> >
> > I don't think we should hurry this stuff - there is still some thread 
> > safety concerns I have, and I'm not sure it's gotten a lot of use since 
> > it's not part of the example or general tests. This is very important stuff 
> > IMO, and we want to get this change right, even if that means it doesn't 
> > make 4.3. 5x was almost invented for baking this type of change :)
> >
> > The faster we can come to a consensus, the faster we can get started baking 
> > things though.
> >
> > Anyone voting for the properties file?
> >
> > - Mark
> >
> > On Mar 14, 2013, at 3:46 PM, Robert Muir <[email protected]> wrote:
> >
> > > I'm late to the game here, so I apologize if a lot of this has already
> > > been discussed...
> > >
> > > I was looking recently and a little confused about the new .properties
> > > format, a few questions:
> > > * is the current plan to deprecate the solr.xml support in 4.x and drop 
> > > for 5.0?
> > > * is there a real advantage to the .properties format over the
> > > existing .xml? When debugging it seemed I was in unknown territory a
> > > little bit, and this sorta means going forward that everything in here
> > > is assumed to be flat: but it isnt really today, and what if more
> > > stuff needed to be added in the future that wasnt flat. For example
> > > the shard handler stuff has nested elements, but i kinda had to guess
> > > at how this mapped to .properties (since xml differentiates between
> > > attributes and elements, but its not so clear with .properties).
> > >
> > > It seems to me there are two changes involved:
> > > 1. ability to auto-discover cores from the filesystem so you don't
> > > need to explicitly list them
> > > 2. changing .xml format to .properties
> > >
> > > I guess just brainstorming, like what if we just kept the existing
> > > .xml format? we could add a new "autoDiscover=true" attribute to the
> > > cores element for people who don't want to list them explicitly.
> > > Additionally we could make the file optional (in case you have no need
> > > to configure anything in it). This might be less of an intrusive
> > > change for 4.x too, as its a less radical change, just a new attribute
> > > users can test if they want. Maybe we make it the default later in 5.x
> > > but in general the format would still be the same, and we wouldnt need
> > > the 2 different config implementations.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to