Re: [Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
Excerpts from Simon Schampijer's message of 2012-01-10 14:19:59 +0100: It would be good to add here information what the patch does, that it adds the uid and filesize to the metadata but does not write it to disk etc. How about this additional paragraph: We now determine and fill in the 'filesize' and 'uid' properties in find() and get_properties(), instead of relying on the on-disk copy in .../metadata/{filesize,uid}. [_fill_internal_props()] +file_path = self._file_store.get_file_path(uid) +if os.path.exists(file_path): +stat = os.stat(file_path) +metadata['filesize'] = str(stat.st_size) +else: +metadata['filesize'] = '0' Why do you make the filesize a string here? In all the other cases in the code it is an int e.g.: Because in the other cases we store it on disk afterwards, converting it to a string on the way. (Binary) strings are the only type supported by sugar-datastore = 0.84, everything else gets converted somewhere. Sascha -- http://sascha.silbe.org/ http://www.infra-silbe.de/ signature.asc Description: PGP signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
Hi Sascha, thanks for the patch. On 02/11/11 23:21, Sascha Silbe wrote: The copy in the metadata storage can get corrupted, e.g. due to low level crashes or running out of battery (see OLPC#11372 [1] for a real-life example). This is especially problematic for the uid property, since without it the caller (i.e. the Journal) can't even figure out which entry to delete. [1] https://dev.laptop.org/ticket/11372 It would be good to add here information what the patch does, that it adds the uid and filesize to the metadata but does not write it to disk etc. Reported-by: Gary Martingarycmar...@googlemail.com Signed-off-by: Sascha Silbesi...@activitycentral.com --- Passes the test suite - at least after fixing the latter to accept decimal strings instead of just integers for the 'filesize' property. No difference in performance noticeable with three runs of the test suite (default settings, on my desktop machine). src/carquinyol/datastore.py | 20 1 files changed, 20 insertions(+), 0 deletions(-) diff --git a/src/carquinyol/datastore.py b/src/carquinyol/datastore.py index 4f3faba..eafc8e1 100644 --- a/src/carquinyol/datastore.py +++ b/src/carquinyol/datastore.py @@ -333,6 +333,7 @@ class DataStore(dbus.service.Object): return self._find_all(query, properties) metadata = self._metadata_store.retrieve(uid, properties) +self._fill_internal_props(metadata, uid, properties) entries.append(metadata) logger.debug('find(): %r', time.time() - t) @@ -350,10 +351,28 @@ class DataStore(dbus.service.Object): entries = [] for uid in uids: metadata = self._metadata_store.retrieve(uid, properties) +self._fill_internal_props(metadata, uid, properties) entries.append(metadata) return entries, count +def _fill_internal_props(self, metadata, uid, names=None): +Fill in internal / computed properties in metadata + +Properties are only set if they appear in names or if names is +empty. + +if not names or 'uid' in names: +metadata['uid'] = uid + +if not names or 'filesize' in names: +file_path = self._file_store.get_file_path(uid) +if os.path.exists(file_path): +stat = os.stat(file_path) +metadata['filesize'] = str(stat.st_size) +else: +metadata['filesize'] = '0' Why do you make the filesize a string here? In all the other cases in the code it is an int e.g.: if os.path.exists(file_path): stat = os.stat(file_path) props['filesize'] = stat.st_size else: props['filesize'] = 0 I think we should be consistent here. Regards, Simon ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
Excerpts from James Cameron's message of 2011-11-02 23:42:59 +0100: I'm glad to see fixes for corrupted metadata, but I wonder if the application (the datastore process) is properly asking the kernel to retain the data? I wouldn't like to see it block on a sync, but I thought there were ways to ask for the data to be written soon. We're not using fsync() or sync() because we expect serious performance regressions if we do (since the entire data store would block on the fsync() call, not just a single API call). The current data store layout splits everything over lots of different files which we'd have to fsync() individually. If you know of a better way to improve reliability, I'm all ears. Neither madvise() nor posix_fadvise() do what we need. sync_file_range() isn't applicable to newly-created files. FWIW, if fsync() were fast enough, the algorithm I'd use with the current data store layout would be: 1. Clear index updated flag. 2. Create a temporary directory for the new entry, keeping the file descriptor open. 3. Write all metadata, keeping the file descriptors open. 4. Write the data, keeping the file descriptor open. 5. fdatasync() and close all metadata files. 6. fdatasync() and close the data file. 7. fsync() and close the temporary directory. 8. rename() the new entry (temporary directory) into place. 9. Update the Xapian index. 10. Set index updated flag. 11. Signal completion to caller. The intention of the interleaved FIFO order for the writes and fsync()s would be to let the kernel write as much data as possible to the disk in the background and only wait for completion when there's nothing else for us to do. With the above sequence (and assuming no data is lost inside the storage device, e.g. due to integrated write caches), the data store would provide at least three of the four ACID properties: Atomicity (final rename()), Consistency (fsync() + final rename()), Durability (signalling completion after everything has reached permanent storage). It might even be considered having the Isolation property, but I'd have to think some more about that. Sascha -- http://sascha.silbe.org/ http://www.infra-silbe.de/ signature.asc Description: PGP signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
Yes, blocking on fsync() would be bad. How about a fork-and-fsync? I wonder how long the kernel is leaving the filesystem in this state? -- James Cameron http://quozl.linux.org.au/ ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
[Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
The copy in the metadata storage can get corrupted, e.g. due to low level crashes or running out of battery (see OLPC#11372 [1] for a real-life example). This is especially problematic for the uid property, since without it the caller (i.e. the Journal) can't even figure out which entry to delete. [1] https://dev.laptop.org/ticket/11372 Reported-by: Gary Martin garycmar...@googlemail.com Signed-off-by: Sascha Silbe si...@activitycentral.com --- Passes the test suite - at least after fixing the latter to accept decimal strings instead of just integers for the 'filesize' property. No difference in performance noticeable with three runs of the test suite (default settings, on my desktop machine). src/carquinyol/datastore.py | 20 1 files changed, 20 insertions(+), 0 deletions(-) diff --git a/src/carquinyol/datastore.py b/src/carquinyol/datastore.py index 4f3faba..eafc8e1 100644 --- a/src/carquinyol/datastore.py +++ b/src/carquinyol/datastore.py @@ -333,6 +333,7 @@ class DataStore(dbus.service.Object): return self._find_all(query, properties) metadata = self._metadata_store.retrieve(uid, properties) +self._fill_internal_props(metadata, uid, properties) entries.append(metadata) logger.debug('find(): %r', time.time() - t) @@ -350,10 +351,28 @@ class DataStore(dbus.service.Object): entries = [] for uid in uids: metadata = self._metadata_store.retrieve(uid, properties) +self._fill_internal_props(metadata, uid, properties) entries.append(metadata) return entries, count +def _fill_internal_props(self, metadata, uid, names=None): +Fill in internal / computed properties in metadata + +Properties are only set if they appear in names or if names is +empty. + +if not names or 'uid' in names: +metadata['uid'] = uid + +if not names or 'filesize' in names: +file_path = self._file_store.get_file_path(uid) +if os.path.exists(file_path): +stat = os.stat(file_path) +metadata['filesize'] = str(stat.st_size) +else: +metadata['filesize'] = '0' + @dbus.service.method(DS_DBUS_INTERFACE, in_signature='s', out_signature='s', @@ -376,6 +395,7 @@ class DataStore(dbus.service.Object): def get_properties(self, uid): logging.debug('datastore.get_properties %r', uid) metadata = self._metadata_store.retrieve(uid) +self._fill_internal_props(metadata, uid) return metadata @dbus.service.method(DS_DBUS_INTERFACE, -- 1.7.6.3 ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] [PATCH sugar-datastore] Ensure we return valid internal / calculated properties
I'm glad to see fixes for corrupted metadata, but I wonder if the application (the datastore process) is properly asking the kernel to retain the data? I wouldn't like to see it block on a sync, but I thought there were ways to ask for the data to be written soon. -- James Cameron http://quozl.linux.org.au/ ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel