[freenet-dev] Refactoring Freenet and Library was Re: Gun.IO and Freenet

Matthew Toseland Thu, 22 Mar 2012 12:00:03 +0000

On Wednesday 21 Mar 2012 00:30:42 Ximin Luo wrote:
> "Implicit dependency" means that changing some code in one area results in a
> logical error in the other, that is not immediately apparent in the
> documentation nor any compile-time checking built into the language. Obviously
> this is not necessarily a bad thing, but there is a lot of it in the config
> system, and elsewhere, including in how the configured values are used.


Okay...
> 
> For example, for the longest time there was no centralised management of
> run-time files used by Freenet, and even now they are just a bunch of extra
> fields in Node.java because I had no existing structure to fit such a system
> into. However, Node.java is such a fucking disorganised mess, with several
> hundred fields, that it's not immediately obvious to everyone that they should
> be using "new File(Node.nodeDir(), fileName)" rather than "new 
> File(fileName)".
> I cleaned it up when I first wrote that part, but it looks like misuses have
> slowly crept in again.

So your complaint is that there is no single central point to put all the 
folder names in? Feel free to create one.

Making stuff static is tempting but means we can't do multi-nodes-in-one-VM 
tests, although I believe classloader hacks can solve that? Is there much of a 
performance cost? Otherwise we can just use a global object and grab it from 
Node, or ClientContext.
> 
> Yes, configs have EVERYTHING to do with "writing the values out". All the
> read/write logic is wrapped up into a ConfigCallback (which throws a
> InvalidConfigException) which is ONLY PASSED TO THE CONFIG SUBSYSTEM. Even
> ignoring the obvious intention of whoever named these classes, the target of
> the read/write logic is completely detached from the logic itself.

ConfigCallback doesn't do any writing out. And the getters don't throw. It can 
only throw when a value is changed. So it's still not clear what the problem is.
> 
> Your explanation about the ObjectContainer also assumes a severely stunted
> events framework can't support fine grained thread control. Why can't there, 
> in
> your imagination, exist SOMEWHERE, a framework that lets you mandate "only run
> jobs of type J in thread T"?

And those jobs are called with parameters (ObjectContainer, ClientContext) ? 
Since we'll have to subclass it anyway, it's going to be more messy than the 
current solution, isn't it? Anyway DBJob/DBJobRunner is pretty simple (although 
it's hidden in NodeClientCore along with ClientContext initialisation which is 
kind of messy).

I agree we should make *Executor use the standard APIs though - while keeping 
the ability to change thread priorities.
> 
> An events framework could handle errant plugins by setting a timeout for all
> blocking jobs. If they don't complete within X time, interrupt the thread
> and/or kill it if it doesn't respond to the interrupt. If the plugin can't
> handle it, tough shit their fault. (This is how certain companies run reliable
> services on a massive scale.)

Can't be done. Period.

It's Java. Java does not support killing threads. According to the 
documentation the reason for this is that killing a thread might cause 
synchronization problems. Which is true, but I would have thought it was 
solvable ... Anyway if you need to be able to kill stuff it has to run in *a 
separate JVM*.
> 
> As for the updater, simply seeing where the currently-running JARs are, is not
> enough, if an update introduces extra JARs as a dependancy. It's necessary to
> have an explicit list of all the dependencies. Unfortunately it's not enough 
> to
> simply use the Class-Path attribute of the manifest of freenet.jar, because
> that hard-codes the path to the other JARs, whereas we need it to be variable
> for packaging purposes. So, this information can only be figured out at
> *package time* (for arbitrary package layouts), and the updater needs a
> mechanism to read this information, and probably update that as well.

You are planning to have a packaged version which is able to 
auto-update-over-Freenet?

Currently class-path hardcoded freenet-ext.jar for convenience more than much 
else. 99% of installs use the wrapper.
> 
> X
> 
> On 20/03/12 23:18, Matthew Toseland wrote:
> > On Monday 19 Mar 2012 01:52:19 Ximin Luo wrote:
> >> On 16/03/12 18:13, Matthew Toseland wrote:
> >>> On Thursday 15 Mar 2012 21:02:26 Ximin Luo wrote:
> >>>> (Top-posting because previous post is too long to reply to in-line.)
> >>>>
> >>>> ## refactoring
> >>>>
> >>>> Refactoring the code helps to make it more understandable for other 
> >>>> coders. One
> >>>> of the reasons why I don't work more on freenet myself is because it's
> >>>> extremely difficult to make any changes that are more than a simple bug 
> >>>> fix.
> >>>
> >>> Hmmm, good point. But from FPI point of view, unless it's a very quick 
> >>> and high impact refactoring,  it's probably too long term.
> >>>>
> >>>> When I was writing code for the config subsystem to separate out 
> >>>> different
> >>>> files into different directories, this was very frustrating and took up 
> >>>> much
> >>>> more time than I expected.
> >>>
> >>> I'm still curious as to why.
> >>
> >> The code is spread out over many files not related to config management; 
> > 
> > It is? How so?
> > 
> > The code for handling the config files simply takes a value and applies it 
> > to that subsystem. Short of converting it to get*/set*'ers, using 
> > reflection, I don't see how we could make it much simpler.
> > 
> >> this
> >> creates lots of implicit dependencies, which is not directly apparent in 
> >> the
> >> syntax/structure of the code nor the semantics of the languag. In other 
> >> words,
> >> spaghetti code.
> > 
> > I still don't follow - implicit dependancies?
> >>
> >> Config management code should not be mixed into the thing they are 
> >> configuring,
> >> this is just Good Programming. 
> > 
> > I don't get it. Evidently I didn't take that course.
> > 
> >> Changing variables on-the-fly is a completely
> >> separate issue from config management and actually writing out those 
> >> values,
> >> and it was a mistake to bundle the two together in the code.
> > 
> > The code that is "spread out over many files" has nothing to do with 
> > writing the values out. All it does is changes the values. Sometimes it is 
> > necessary to store a copy of the configured value so that it can be 
> > returned exactly, most of the time it just gets and sets a variable.
> >>
> >> Using the guice framework would help this a lot as well.
> > 
> > I am not convinced.
> >>
> >>>>
> >>>> NB: I'm not implying you should be doing all the work toad, please don't 
> >>>> think
> >>>> this. And please don't think that my opinions are automatically invalid 
> >>>> / less
> >>>> worthy just because I haven't committed code for ages. I just want to 
> >>>> express
> >>>> my view of What I Would Do If I Were God.
> >>>
> >>> Okay. :) I don't mean to criticise un-constructively, just have a useful 
> >>> conversation.
> >>>>
> >>>> People can do what they want but it doesn't mean it's a good idea. 
> >>>> Granted, I
> >>>> consider "code quality" independently of any issues such as funding, but
> >>>> focusing too much on the latter leads to bad software. If there's 
> >>>> pressure like
> >>>> this, ignore it, find something else in the meantime until it goes away.
> >>>
> >>> I plan to continue working for FPI for the time being. Partly because 
> >>> it's part time and fits conveniently with studying. Partly because 
> >>> getting a programming job in the UK requires a degree even if you have 
> >>> some rather odd experience.
> >>>>
> >>>> ## freenet Events API
> >>>>
> >>>> freenet does not provide a great API for running callbacks and scheduled 
> >>>> tasks.
> >>>
> >>> I'm not sure what you mean here. For example, the client layer has unique 
> >>> requirements. We simply can't just change it to use java.util.concurrent. 
> >>> Database jobs must run on the database thread, and no other thread may 
> >>> have access to the database, because the client layer is not designed to 
> >>> deal with simultaneous transactions. This avoids performance problems 
> >>> (separate caches for each client; simultaneous disk I/O), and complexity 
> >>> (refreshing everything constantly, which requires a more query-oriented 
> >>> architecture; we'd have to change pretty much everything, and it'd 
> >>> probably be slower). But it's messy, in particular it requires passing 
> >>> around ObjectContainer's.
> >>
> >> Why can't these restrictions be handled by a 3rd-party events framework?
> > 
> > They can't. Period.
> > 
> > The long answer is you can't pass an ObjectContainer to any other thread, 
> > and any DBJob needs both the ObjectContainer and the ClientContext (which 
> > isn't stored in any persistent class and is effectively the means to get 
> > all the objects created for this run of the node that aren't persisted 
> > otherwise, plus some global stuff like the FEC queue).
> > 
> > The simplest way to enforce this rule is simply the ugly convention of 
> > always passing the ObjectContainer in, and NOT either making it a singleton 
> > global or storing it anywhere.
> > 
> > The expensive way to deal with this would involve even more unwritten 
> > rules: You could write a wrapper for ObjectContainer that throws if you try 
> > to access it from another thread, for example.
> > 
> > The really expensive way to deal with this is to have parallel 
> > transactions. However, this way lies madness, because:
> > 1. Typical end-user PCs' disks may be slower rather than faster with more 
> > parallel transactions.
> > 2. More importantly, there is *one cache per transaction*.
> > 3. Code-wise, we would have to refresh everything every time we use it. 
> > Although we already have a lot of activation and deactivation, so maybe 
> > that's less of an issue. We would need to reconsider the "roots", and 
> > probably make everything start with a query. Db4o queries generally are 
> > rather slow so this would probably be either slow or messy (e.g. creating 
> > key objects using strings etc, like p0s does in Freetalk and WoT).
> > 4. We would also have to deal with rollbacks. Arguably these are a good 
> > thing - p0s uses them - but that would mean that some easy optimisations 
> > because extremely difficult. Once these issues are solved for Freetalk/WoT 
> > (e.g. by means of a block level cache capable of caching more than one 
> > transaction and then writing them to an alternate file, and alternating 
> > between the two), it might be worth considering, but I doubt it would be 
> > worth it.
> >>
> >>>> Converting everything to use java.util.concurrent and/or
> >>>> com.google.common.util.concurrent would help this a lot. Of course, 
> >>>> Library can
> >>>> and currently does implement this itself, but it's fairly sloppy, and 
> >>>> other
> >>>> plugins would benefit if such a framework were provided.
> >>>
> >>> If you simply mean replacing Ticker and Executor, that's fine by me.
> >>>>
> >>>> Some common tasks that plugins would like to do, which really should be
> >>>> provided by the underlying application:
> >>>> - run tasks in the background
> >>>> - run tasks according to a particular schedule
> >>>> - cancel running tasks
> >>>
> >>> How do you propose to cancel a task once it's started? I guess it depends 
> >>> what sort of task it is. If it has a boolean and periodically checks it 
> >>> then fine; this would require a subclass ...
> >>
> >> Depends on what the task is. The writer of the "task" will need to add 
> >> code to
> >> support cancellation, just like Future.cancel().
> > 
> > Okay.
> >>
> >>>> - handle dependencies between tasks, so that e.g. if A depends on B and I
> >>>> cancel B, A is automatically cancelled.
> >>>
> >>> That's a nice bit of functionality yeah. Nothing in fred needs it at 
> >>> present, although radical refactorings might make it more useful.
> >>
> >> Having this functionality would make Library's code a lot simpler. (The
> >> dependency management that is, not necessarily the cancellation.) We talked
> >> about it a while back but I never really sat down to solve the issue.
> > 
> > I have no problem with importing a small third party library or even having 
> > such classes in freenet/support/.
> >>
> >>>> - group tasks into related categories so one group doesn't affect the 
> >>>> other.
> >>>> (e.g. tasks for the group "Library/b-tree-write/index-<URL>" won't 
> >>>> starve the
> >>>> resources of group "WoT/background-trust-calculation")
> >>>
> >>> If you are scheduling tasks in a limited pool this makes sense. Freenet 
> >>> generally doesn't do this because most tasks are either CPU intensive and 
> >>> complete quickly, or are blocking / I/O intensive and take ages but don't 
> >>> use much resources. Also many tasks are latency sensitive. And on modern 
> >>> JVMs, lots of threads are cheap, although memory is an issue, we do need 
> >>> to keep it down if possible ...
> >>
> >> There is a thread limit. You don't want plugin A to hog 90% threads and 
> >> leave
> >> the other 10% to everything else.
> > 
> > It's different for plugins IMHO. Although there is only so much you can do 
> > to guard against errant plugins.
> >>
> >>>>
> >>>> ## config management
> >>>>
> >>>> Code for read/writing the config in embedded inside the classes they 
> >>>> control,
> >>>> which clutters up the code and makes it confusing. Config code should be
> >>>> separated from the actual application. It would also be nice if this was
> >>>> exposed to plugins.
> >>>
> >>> Separated how exactly? You want to call getters via reflection, like java 
> >>> beans? It is always going to be the case that changing some settings on 
> >>> the fly will be somewhat involved and require methods for it; the classes 
> >>> are only one way to write such an interface, I'm open to other mechanisms 
> >>> given that config happens very infrequently.
> >>>>
> >>>> Being a daemon is why the current config system is NOT SUFFICIENT. I 
> >>>> need to be
> >>>> able to lock certain config settings, such as where to keep the 
> >>>> datastore /
> >>>> run-time files themselves, to conform to the FHS. It's best that such 
> >>>> settings
> >>>> are kept in a separate read-only file. The current system only reads one 
> >>>> file,
> >>>> freenet.ini.
> >>>
> >>> Sounds to me like that could be achieved with some fairly small changes; 
> >>> I still don't see what the problem is.
> >>
> >> Adding an "#include" mechanism for config files whilst supporting "write" 
> >> at
> >> the same time is not a "small change". Think about it.
> > 
> > There are several cheap and dirty options, for example, we could provide a 
> > list of config files on the command line, with changes being written to the 
> > first (or the last, possibly configurable with another command line option).
> > 
> > But yes, maybe we need something more elegant.
> > 
> > But *we need to be able to write to a config file*. Ease of use on the 
> > majority platform (Windows) makes this an absolute requirement. We cannot 
> > get rid of it, period.
> >>
> >> (This is why .gitconfig doesn't have #include whereas .hgrc does.)
> >>
> >>>> Updating its own binaries is not a problem if freenet knows where the 
> >>>> binaries
> >>>> are. The current installer puts them in a rigid place, however this is
> >>>> incompatible with FHS.
> >>>
> >>> A rigid place? I don't follow. If you are doing FHS you are using a 
> >>> package; the installer is never going to comply with FHS as, apart from 
> >>> anything else, it shouldn't be run as root!
> >>
> >> The updater (UpdateDeployContext) makes some very specific assumptions as 
> >> to
> >> the locations of the jars, and this method doesn't generalise to non-jar 
> >> files.
> >> It also assumes wrapper.conf is in the current directory.
> > 
> > I believe we can ask the wrapper where the file is - and probably the jars 
> > too.
> > 
> > However I thought you were expecting to just turn off auto-update in 
> > packaged versions?
> >>
> >>> Updating its own binaries is incompatible with the standard unix way of 
> >>> doing things, isn't it? Even if it's not technically a violation of FHS?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20120322/4f693b48/attachment.pgp>

[freenet-dev] Refactoring Freenet and Library was Re: Gun.IO and Freenet

Reply via email to