[Tech] Fcp woes ..final

Matthew Toseland Thu, 20 Mar 2008 21:50:21 +0000

On Thursday 20 March 2008 17:29, juergen urner wrote:
> Message got lost somehow. Second attempt.
> 
> Matthew Toseland schrieb:
> > On Wednesday 19 March 2008 22:48, juergen urner wrote:
> >  
> >> Matthew Toseland schrieb:
> >>    
> >>> On Wednesday 19 March 2008 14:27, juergen urner wrote:
> >>>        
> >>>> Just dropping by to tell, that I finally resign on writing a 
> >>>> highlevel Python wrapper
> >>>> for the protocol. There is still too many show stoppers in there to 
> >>>> write it nice and
> >>>> clean.
> >>>>
> >>>>
> >>>> I'd like to give some feedback on what made my live hard. Maybe 
> >>>> s.o. can make
> >>>> use of it. In no particular order....
> >>>>
> >>>>         
> 
>>> -------------------------------------------------------------------------------------------------------------
>>>  
> >>>
> >>>      
> >>>> [Global queue and persistence]
> >>>> Already reported. Maybe one last note regarding this. If apps feel 
> >>>> the need to share data
> >>>> it should be up to them to organize this.
> >>>>             
> >>> Noted. As you know we disagree.
> >>>        
> >>>> [Persistent requests vs. non persistents]
> >>>> x. one may want to modify or remove non persistent requests aswell
> >>>>     this should eliminate all the case switches if 
> >>>> Persistence==connection else...
> >>>> x. there should be an explicite way to stop, resume all kinds of 
> >>>> requests
> >>>>             
> >>> I agree, I just haven't got around to dealing with it yet. There 
> >>> didn't       
> > seem  
> >>> to be much demand.
> >>>        
> >>>> [DataFound vs. AllData]
> >>>> Another case switch that looks rather unnecessary. AllData is not 
> >>>> send for persistent
> >>>> requests.             
> >>> Not true. AllData is sent with the data, for any request where 
> >>> returntype=direct. We don't send the data for persistent requests 
> >>> until       
> > the  
> >>> client asks for it.        
> >> Sorry, sloppy tying. For persistents you have to explicitely query it 
> >> by sending
> >> GetRequestStatus. To me it looks like a special case that breaks 
> >> consistency.
> >>     
> >
> > You mean we should always force the client to do GetRequestStatus? 
> > What's your proposed solution?
> >   
> 
> I don't know what exactly the node does for persistent requests. Buffer 
> data somewhere?


Depends on what you ask it to do. You can get it to write it direct to disk, 
which is the most efficient in terms of disk space, or you can have it store 
it internally in an encrypted temporary file.

> What can the node do better than client here?

Save a lot of disk space, run requests when the client isn't running, lots.
> 
> I'd Pass it along with AllData. It is assumed to be of reasonable size 
> anyways. 

We always pass the data with AllData. It's a question of when we should send 
AllData. At the moment, for persistent requests, we send AllData only when 
asked to; we send DataFound when the data is found and when the client 
connects.

> If you 
> don't need it right away, feel free to ignore it. Otherwise clients are 
> broken
> relying on data automatically passed on AllData for Persistence=connection.

AFAIK AllData always includes data.
> 
> >>>> Looks loke a very special case to me, maybe some Frost. Imo,  if there
> >>>> is a need to split data fetching and         
> > whatever-client-wants-to-do-with-data
> >  
> >>>> apart, doing it explicitely could be much clearer. Something like...
> >>>>
> >>>> client --> node
> >>>> GetData
> >>>> End
> >>>>
> >>>> node-->client
> >>>> DataFound
> >>>> End
> >>>>
> >>>> client-->node
> >>>> GetDataAs
> >>>>     Filename=
> >>>>     Direct=
> >>>> End
> >>>>         
> >>> This is pretty much what we do, with the exception that you have to 
> >>>       
> > specify  
> >>> where to write the data to on starting the request, and have to do a 
> >>> DDA verification.
> >>>       
> >
> > I'm not entirely averse to the possibility of doing a fetch with 
> > returntype=direct, and then telling the node where to write the data 
> > to after it's been fetched. However, it's more efficient (disk space) 
> > to tell the node in advance where the data should go.
> >   
> Another don't-know-what-exactly-the-node-is-doing. Decode on the fly to 
> the desired
> location? Buffer into persistent-tmp if necessary?

*Any* request will at some point have to be buffered in temporary space by the 
node.

You're not seriously arguing that clients should do everything themselves? 
That would make clients much more complex and introduce no end of 
compatibility problems at the same time.
> 
> >>>> [SimpleFiledSet and Fcp]
> >>>> This is a nasty one: dotted names.  It may be easy or not for some 
> >>>> languages to deal with
> >>>> this data type. Actually it is easy in python, but doing additional 
> >>>> processing based on
> >>>> message signature (like type conversions) is pretty painful and 
> >>>> costy. Flat messages
> >>>> shopuld be way easier to handle. Already made a suggestion 
> >>>> regarding         
> > this...
> >  
> >>> I really don't understand what the difference is. We provide a 
> >>> Count. If       
> > you  
> >>> are parsing into a tree structure, it's trivial. Even if you're not, 
> >>>       
> > you've  
> >>> got the Count, so you can allocate the array before you parse the       
> > sub-items.  
> >>> Given a multiplexable protocol, using separate messages would mean 
> >>> having       
> > an  
> >>> identifier in each one which would be just as messy. The only reason 
> >>> to do       
> > it  
> >>> the way you propose would be to avoid very large messages.
> >>>       
> >> Thank God no pre-allocation here :-)
> >>     
> >
> > :)
> >  
> >> ...an identifier would be required, came to my mind too. Not nice.
> >> But you don't have to multiplex here if the number of subitems can
> >> be assumed to be small enough for clients to deal with at once.
> >
> > In which case they are submessages rather than messages. In which case 
> > why not simply include them in the parent message?
> >   
> ...but easier to parse and handle + leave door open for multiplexing.
> Sorry for plugin holes into SimpleFieldSet. You pointed me to it, remember?

Marginally easier to parse. You have the values first, then you have the 
sub-fieldsets. A sub-fieldset is included in its entirety before moving on to 
the next one. How is this different to having separate messages?
> 
> >>>> ContainerMessage
> >>>>     NItems=N
> >>>> End
> >>>> Item
> >>>> End
> >>>>
> >>>> [Message signatures and consts]
> >>>> (already mentioned before) a good to have would be a  file containg 
> >>>> all consts and message
> >>>> signatures used in Fcp.         
> >>> You mean error codes? Or what?        
> >> <PseudocodePython>
> >>
> >> class Message:
> >>     name = 'Foo'
> >>     Fields = {
> >>     'Bar': DataTypeBool,
> >>     }
> >>
> >> ErrorCode = 1
> >> Const = 1
> >>
> >> </PseudocodePython>
> >>
> >> This is could be fed to a dedicated parser, saving much work
> >> for any protocol wrapper in any language.
> >
> > Ahhh. Well, patches are welcome. So are bug reports (on the bug tracker).
> >  
> >>>        
> >>>> It should go into very detail regarding types  aswell. 
> >>>> TypeIPAddress. TypeIPAddressList (...)
> >>>> This would save much typing work and could help in autogenarating 
> >>>> huge parts of protocol wrappers.
> >>>>         
> >>> I don't understand.
> >>>      
> >>>> [Organize requests hirarchically]
> >>>> This may sound exotic, but it could help in organizing requests. 
> >>>> Especially when it comes to restoring
> >>>> requests, where it may be desireable to process requests on demand. 
> >>>> Something like
> >>>> GetRootRequest ..ListChildRequests
> >>>>         
> >>> It *is* exotic.        
> >> Probably you are right.
> >>     
> >
> > I thought you didn't like hierarchy? You seem to have a lot of trouble 
> > with it above.
> 
> I love hierarchies. That is, plugin holes into them :-)
> 
> >>>> [Reserve DataLenght as indicator of data attatched to a message]
> >>>> (Already mentioned) makes message parsing much easier if one can 
> >>>> rely on this field being
> >>>> reserved as message metadata.
> >>>>         
> >>> What else can it mean? DataLength always means the data is the given 
> >>>       
> > length.  
> >>> And if the message ends with Data, there is data attached of that 
> >>> length. IIRC we may use DataLength in some places where there is no 
> >>> data attached, but in that case we don't end with Data.
> >>>       
> >> if ...else
> >>     
> >
> > Right, but do you really want DataFound to not tell you what the 
> > length of the data is?
> 
> The more metadata the better. According to your naming scheme I'd expect
> Metadata.DataLength as opposed to DataLength.

File a bug. Although the costs of changing it would at this point be quite 
high.
> 
> >>>> [ConfigData, NodeData, Peer should be traversable]
> >>>> See [Organize requests hirarchically]
> >>>>         
> >>> Whatever. Very few apps use that stuff.        
> >> Mine does.
> >
> > What do you mean by traversible? They should be multiplexible, in that 
> > they should have an Identifier.
> >  
> >>>> [Drop Started field in Persistent* messages]
> >>>> Maybe use simple progress instead, and / or some dedicated field to 
> >>>> indicate request status.
> >>>> This could be helpful anyways if requests may be stopped, resumed 
> >>>> (...) at some point
> >>>>         
> >>> Won't fix. If you don't like the field, ignore it.        
> >> def restoreClientGet(msgPersistentGet):
> >>     if 'Started' in msgPersistentGet.fields:
> >>         del msgPersistentGet.fields['Started']
> >>     msgClientGet = ClientGet(fields=msg.fields)      
> >
> > LOL why?
> 
> Being pedantic as only python coders can get... otherwise I'd have to leave
> a note in the docs explaining the purpose of this field.
> 
> >>>> [Dedicated Result field in messages]
> >>>> In case something goes wrong, it would be nice to have the message 
> >>>> casing the error
> >>>> at hand right away. ++ ProtocolError, FetchError, InsertError are 
> >>>> not very handy. They
> >>>> should be numbered consecutively. I found myself writing a wrapper 
> >>>> to emulate this
> >>>> (uniformity).
> >>>>         
> >>> You mean the different error code types should be in different 
> >>> numerical ranges?
> >>>         
> >> Yes. Afaics for clients it makes no difference if an error is a 
> >> protocol error
> >> or a fetch error.
> >>
> >> ProtocolErrorMin = 0
> >> ProtocolErrorMax = 1000
> >> FetchErrorMin = ProtocolErrorMax + 1
> >> (...)
> >
> > A protocol error indicates a bug in the library, or something that the 
> > library can deal with (e.g. you need to do DDA verification). A fetch 
> > error indicates a Freenet level problem. A client should never see a 
> > protocol error ... I suppose that's not strictly true any more... DDA 
> > stuff you can generally avoid, since you should be handling that 
> > transparently in the library, but if the client passes you a malformed 
> > URL, you need to pass that back to the client; if it tries to access 
> > an unknown plugin, peer with itself etc, you may need to pass it back 
> > ... hmmm. At this point, some clients may be relying on error codes, 
> > so I don't really think changing it is a good idea...
> 
> Try TestDDA --> fails --> inform user. Thought I bring this up for 
> consideration.

You might inform the user, or you might (for a non-persistent request) fall 
back to doing a ReturnType=direct request and then writing it yourself. OTOH 
if the error simply indicates that you need to do the DDA test, you'd handle 
it inline.
> 
> >>>> [ClientToken field for all messages]
> >>>> It may be desireable to keep track of all messages and responses. 
> >>>> Like for example GetConfig
> >>>> and Config to tell when all tasks a user has posted are completed. 
> >>>> ClientToken would come in
> >>>> handy.
> >>>>         
> >>> I agree that some messages need an Identifier. But please file 
> >>> specific       
> > bugs,  
> >>> or better provide specific patches.        
> >> No Java for me. Off cause you could embedd a python interpreter ;-)
> >> Rule of thumb is: it is useful in all messages.
> >
> > Right. Well I will get around to reviewing them eventually.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: 
<https://emu.freenetproject.org/pipermail/tech/attachments/20080320/0e0790af/attachment.pgp>

[Tech] Fcp woes ..final

Reply via email to