Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
> As far as I can tell though that won't actually fix the issue triggering rebuilds in your case. Something else must be changing. It would be interesting if you could dig into what that might be as I can't reproduce the rebuilding issue here.

I discovered that different versions of dylan from git and the tarball were causing the rebuilds to occur. Once I had the exact same version of dylan, the entire build was pulled from sstate-cache. My understanding now is that the recipe file dependencies changed for some recipes. Therefore the sorted checksums were modified, triggering a rebuild.

> As I said earlier, if you look at the code only the checksum value goes into the task checksum and not the filename, so the path to the base of the metadata changing cannot influence the task checksum.
>
> However, I think I can see how you would think that the paths changing influenced the checksum, since the data that we put into the siginfo file *does* contain the full path, and thus bitbake-diffsigs will report that as having changed even though that difference didn't change the task checksum. That is a bug and we should fix how we store paths either in the cache or in the siginfo file, but that fix will have to take into account different layer paths. I'd suggest a simpler and more reliable approach would be to just get the relative path to os.path.dirname of the recipe rather than TOPDIR.

The full paths were a little confusing. I bet that the checksum differences (or other changes) were at the bottom of the diffsigs output.

I have not had a chance to look at this again closely. When I do, I'll check out your suggestions.

- Jate S.
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core
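For illustration, the suggestion quoted above (take each local file's path relative to os.path.dirname of the recipe rather than TOPDIR) might be sketched as follows. This is not bitbake code: the helper name relative_to_recipe and the example recipe and file paths are made up for the example.

```python
import os.path


def relative_to_recipe(recipe_file, local_file):
    """Return local_file relative to the directory that holds recipe_file.

    Hypothetical helper for illustration only; not part of bitbake.
    """
    recipe_dir = os.path.dirname(recipe_file)
    return os.path.relpath(local_file, recipe_dir)


# The same tree checked out under two different parent directories
# (hypothetical paths).
a = relative_to_recipe(
    "/home/user/poky1/meta/recipes-core/foo/foo_1.0.bb",
    "/home/user/poky1/meta/recipes-core/foo/files/fix.patch",
)
b = relative_to_recipe(
    "/home/user/poky2/meta/recipes-core/foo/foo_1.0.bb",
    "/home/user/poky2/meta/recipes-core/foo/files/fix.patch",
)
print(a, b, a == b)
```

A file that lives outside the recipe's directory (for example in another layer) would come out with leading ".." components, so a real fix would presumably still have to deal with different layer paths.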
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
Hi Jate,

On Wednesday 19 June 2013 13:14:54 Jate Sujjavanich wrote:
> Paul Eggleton wrote on Wednesday, June 19, 2013 at 11:45 AM:
> > Actually, looking more closely at this I'm not sure how the full path to the file would be getting into the signature - looking at lib/bb/siggen.py it should only be adding the file checksum value to the signature data and not the path. I did a quick test with master by moving some files referred to in SRC_URI to a different valid location (thus changing their full path), cleaning the recipe and then building it again, and the output was restored from sstate rather than rebuilding.
> >
> > Can you explain how you came to the conclusion that this was why the checksums were different on different machines?
>
> On the same machine, I had two identical poky directories. One, I built first to use as the cache source. I then configured the second one to use a SSTATE_MIRRORS pointing to the first. The build of the second would just detect changes and force a rebuild.
>
> Running the bitbake-diffsigs tool on the do_fetch...sigdata showed something like the following on all of the local files:
>
> Removed(/home/user/poky1/meta/recipes...file1, chksum1)
> Added(/home/user/poky2/meta/recipes...file1, chksum1)
>
> Upon applying my patch, the components of core-image-minimal were pulled from the sstate-cache.

I just got around to trying this again, only this time I cloned a new poky tree in a different path, which should replicate your situation, and used SSTATE_MIRRORS to point to the previous sstate-cache. It built completely from sstate, as expected. For reference, I was testing with the latest dylan branch and core-image-minimal.

As I said earlier, if you look at the code only the checksum value goes into the task checksum and not the filename, so the path to the base of the metadata changing cannot influence the task checksum.

However, I think I can see how you would think that the paths changing influenced the checksum, since the data that we put into the siginfo file *does* contain the full path, and thus bitbake-diffsigs will report that as having changed even though that difference didn't change the task checksum. That is a bug and we should fix how we store paths either in the cache or in the siginfo file, but that fix will have to take into account different layer paths. I'd suggest a simpler and more reliable approach would be to just get the relative path to os.path.dirname of the recipe rather than TOPDIR.

As far as I can tell though that won't actually fix the issue triggering rebuilds in your case. Something else must be changing. It would be interesting if you could dig into what that might be as I can't reproduce the rebuilding issue here.

Cheers,
Paul

-- 
Paul Eggleton
Intel Open Source Technology Centre
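The diffsigs behaviour described above can be modelled with a toy comparison. The sketch below is only a simplified stand-in for what bitbake-diffsigs reports, not its real code; the function name diff_file_checksums and the example paths are hypothetical. It shows how records keyed by full path produce Removed/Added pairs even when every checksum is identical, while records keyed by a relative path do not.

```python
def diff_file_checksums(old, new):
    """Compare two {path: checksum} mappings.

    Returns (removed, added) as sorted lists of (path, checksum) pairs.
    Toy model of a siginfo comparison; not bitbake's implementation.
    """
    removed = sorted((p, c) for p, c in old.items() if new.get(p) != c)
    added = sorted((p, c) for p, c in new.items() if old.get(p) != c)
    return removed, added


# Same file, same checksum, recorded with full paths in two checkouts.
full1 = {"/home/user/poky1/meta/recipes-x/foo/file1": "chksum1"}
full2 = {"/home/user/poky2/meta/recipes-x/foo/file1": "chksum1"}

# The same records with the checkout-specific prefix taken off.
rel1 = {"recipes-x/foo/file1": "chksum1"}
rel2 = {"recipes-x/foo/file1": "chksum1"}

print(diff_file_checksums(full1, full2))  # one Removed and one Added entry
print(diff_file_checksums(rel1, rel2))    # no differences
```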
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
On Tue, Jul 16, 2013 at 6:28 PM, Paul Eggleton <paul.eggle...@linux.intel.com> wrote:
> I just got around to trying this again, only this time I cloned a new poky tree in a different path which should replicate your situation and used SSTATE_MIRRORS to point to the previous sstate-cache. It built completely from sstate as it expected. For reference I was testing with the latest dylan branch and core-image-minimal.
>
> As I said earlier, if you look at the code only the checksum value goes into the task checksum and not the filename, so the path to the base of the metadata changing cannot influence the task checksum.
>
> However, I think I can see how you would think that the paths changing influenced the checksum, since the data that we put into the siginfo file *does* contain the full path, and thus bitbake-diffsigs will report that as having changed even though that difference didn't change the task checksum. That is a bug and we should fix how we store paths either in the cache or in the siginfo file, but that fix will have to take into account different layer paths. I'd suggest a simpler and more reliable approach would be to just get the relative path to os.path.dirname of the recipe rather than TOPDIR.
>
> As far as I can tell though that won't actually fix the issue triggering rebuilds in your case. Something else must be changing. It would be interesting if you could dig into what that might be as I can't reproduce the rebuilding issue here.

On a related topic, we gave a hands-on workshop last week at our Linaro Conference. To avoid potential network issues and wasting too much time building, we had prepared tarballs for 1) the download folder and 2) sstate.

For sstate we prepared several such archives, for OpenSuse and Ubuntu (12.04, 13.04), in both 32-bit and 64-bit variants. We used 'dylan' in our workshop since the archives were created a couple of days before, to avoid new commits in master.

Interesting remarks:
- The 'download' archive worked nicely for everyone.
- The 'sstate' archive worked for some people, but not everyone. In my particular case I was able to run a complete build from sstate without rebuilding *anything* using the 13.04-64 sstate tarball, but for some people it only used the sstate partially.

I didn't get a chance to debug what the problem was, but all our files are available in case more folks want to play with them. You can download the archives here:
http://people.linaro.org/~trevor.woerner/LCE2013/

The instructions to recreate the build environment (e.g. check out the layers) are described here:
https://collaborate.linaro.org/x/NYBm
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
Hi Jate,

On Wednesday 19 June 2013 11:08:10 Jate Sujjavanich wrote:
> This allows sstate-cache to be shared between builds in different directories. Differences in the full path were triggering a false positive when there were actually no changes.
>
> Signed-off-by: Jate Sujjavanich <jate.sujjavan...@myfuelmaster.com>
> ---
>  bitbake/lib/bb/fetch2/__init__.py | 14 +++++++++-----
>  bitbake/lib/bb/siggen.py          |  3 ++-
>  2 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
> index dd1cc93..7ab44d7 100644
> --- a/bitbake/lib/bb/fetch2/__init__.py
> +++ b/bitbake/lib/bb/fetch2/__init__.py
> @@ -900,8 +900,7 @@ def get_checksum_file_list(d):
>      return " ".join(filelist)
>
> -
> -def get_file_checksums(filelist, pn):
> +def get_file_checksums(filelist, pn, topdir):
>      """Get a list of the checksums for a list of local files
>
>      Returns the checksums for a list of local files, caching the results as
> @@ -917,7 +916,12 @@ def get_file_checksums(filelist, pn):
>              bb.warn("Unable to get checksum for %s SRC_URI entry %s: %s" % (pn, os.path.basename(f), e))
>              return None
>          return checksum
> +
> +    (recipe_root, _) = os.path.split(topdir)
> +    def remove_recipe_parent(data):
> +        return data.replace(recipe_root, '').strip('/')
> +
>      checksums = []
>      for pth in filelist.split():
>          checksum = None
> @@ -927,7 +931,7 @@ def get_file_checksums(filelist, pn):
>              for f in glob.glob(pth):
>                  checksum = checksum_file(f)
>                  if checksum:
> -                    checksums.append((f, checksum))
> +                    checksums.append((remove_recipe_parent(f), checksum))
>          elif os.path.isdir(pth):
>              # Handle directories
>              for root, dirs, files in os.walk(pth):
> @@ -935,12 +939,12 @@ def get_file_checksums(filelist, pn):
>                      fullpth = os.path.join(root, name)
>                      checksum = checksum_file(fullpth)
>                      if checksum:
> -                        checksums.append((fullpth, checksum))
> +                        checksums.append((remove_recipe_parent(fullpth), checksum))
>          else:
>              checksum = checksum_file(pth)
>
>          if checksum:
> -            checksums.append((pth, checksum))
> +            checksums.append((remove_recipe_parent(pth), checksum))
>
>      checksums.sort(key=operator.itemgetter(1))
>      return checksums
> diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
> index 8861337..c64acfe 100644
> --- a/bitbake/lib/bb/siggen.py
> +++ b/bitbake/lib/bb/siggen.py
> @@ -74,6 +74,7 @@ class SignatureGeneratorBasic(SignatureGenerator):
>          self.pkgnameextract = re.compile("(?P<fn>.*)\..*")
>          self.basewhitelist = set((data.getVar("BB_HASHBASE_WHITELIST", True) or "").split())
>          self.taskwhitelist = None
> +        self.topdir = data.getVar("TOPDIR", True)
>          self.init_rundepcheck(data)
>
>      def init_rundepcheck(self, data):
> @@ -187,7 +188,7 @@ class SignatureGeneratorBasic(SignatureGenerator):
>                  self.runtaskdeps[k].append(dep)
>
>          if task in dataCache.file_checksums[fn]:
> -            checksums = bb.fetch2.get_file_checksums(dataCache.file_checksums[fn][task], recipename)
> +            checksums = bb.fetch2.get_file_checksums(dataCache.file_checksums[fn][task], recipename, self.topdir)
>              for (f,cs) in checksums:
>                  self.file_checksum_values[k][f] = cs
>                  data = data + cs

Good catch! The only thing is, this will not help for files within different layers which may not be underneath TOPDIR; I think we'll need a function that determines which layer the file is under (longest path match from data.getVar('BBLAYERS', True).split()) and then take that path off the beginning.

Additionally, this is a patch against bitbake so it will need to go to the bitbake-de...@lists.openembedded.org mailing list.

Cheers,
Paul

-- 
Paul Eggleton
Intel Open Source Technology Centre
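The "longest path match" idea mentioned above could be sketched roughly like this. It is only an illustration under stated assumptions: strip_layer_prefix is a hypothetical name, the layers argument stands in for data.getVar('BBLAYERS', True).split(), and the example directories are invented. Unlike a plain str.replace, the sketch only removes a leading layer directory and compares whole path components.

```python
import os.path


def strip_layer_prefix(path, layers):
    """Return path relative to the longest layer directory containing it.

    layers is a list of layer directories (a stand-in for the BBLAYERS
    value). The path is returned unchanged if no layer contains it.
    Hypothetical helper for illustration; not part of bitbake.
    """
    path = os.path.normpath(path)
    best = None
    for layer in layers:
        layer = os.path.normpath(layer)
        # Match on whole components so /x/meta does not match /x/meta-yocto.
        if path == layer or path.startswith(layer + os.sep):
            if best is None or len(layer) > len(best):
                best = layer
    if best is None:
        return path
    return os.path.relpath(path, best)


# Invented layer layout, including one directory nested inside another.
layers = [
    "/home/user/poky",
    "/home/user/poky/meta",
    "/home/user/poky/meta-yocto",
]
inside = strip_layer_prefix(
    "/home/user/poky/meta-yocto/recipes-core/foo/files/a.patch", layers
)
outside = strip_layer_prefix("/opt/other/file", layers)
print(inside)
print(outside)
```

One open design question with this approach: two layers can contain files with the same relative path, so a real implementation might want to record which layer matched as well.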
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
On Wednesday 19 June 2013 16:24:53 Paul Eggleton wrote:
> Hi Jate,
>
> On Wednesday 19 June 2013 11:08:10 Jate Sujjavanich wrote:
> > This allows sstate-cache to be shared between builds in different directories. Differences in the full path were triggering a false positive when there were actually no changes.
> >
> > Signed-off-by: Jate Sujjavanich <jate.sujjavan...@myfuelmaster.com>
> > ---
> >  bitbake/lib/bb/fetch2/__init__.py | 14 +++++++++-----
> >  bitbake/lib/bb/siggen.py          |  3 ++-
> >  2 files changed, 11 insertions(+), 6 deletions(-)
> > [...]
>
> Good catch! The only thing is, this will not help for files within different layers which may not be underneath TOPDIR; I think we'll need a function that determines which layer the file is under (longest path match from data.getVar('BBLAYERS', True).split()) and then take that path off the beginning.
>
> Additionally, this is a patch against bitbake so it will need to go to the bitbake-de...@lists.openembedded.org mailing list.

Actually, looking more closely at this I'm not sure how the full path to the file would be getting into the signature - looking at lib/bb/siggen.py it should only be adding the file checksum value to the signature data and not the path. I did a quick test with master by moving some files referred to in SRC_URI to a different valid location (thus changing their full path), cleaning the recipe and then building it again, and the output was restored from sstate rather than rebuilding.

Can you explain how you came to the conclusion that this was why the checksums were different on different machines?

Cheers,
Paul

-- 
Paul Eggleton
Intel Open Source Technology Centre
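The point made above, that only the file checksum values are added to the signature data and not the paths, can be illustrated with a toy model. This is a sketch, not the siggen.py implementation: task_hash is a hypothetical name, SHA-256 is an arbitrary choice of digest for the example, and the file lists are invented. It follows the same shape as the discussion: sort by checksum value, append each checksum to the data, then hash.

```python
import hashlib
import operator


def task_hash(base, file_checksums):
    """Toy signature: hash base data plus checksum values only.

    file_checksums is a list of (path, checksum) pairs. Paths are ignored
    on purpose, so moving a file does not change the result.
    """
    data = base
    for _path, cs in sorted(file_checksums, key=operator.itemgetter(1)):
        data = data + cs
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


before = [("/old/place/a.patch", "aaa"), ("/old/place/b.cfg", "bbb")]
moved = [("/new/place/b.cfg", "bbb"), ("/new/place/a.patch", "aaa")]
edited = [("/old/place/a.patch", "aaa"), ("/old/place/b.cfg", "ccc")]

print(task_hash("base", before) == task_hash("base", moved))   # True
print(task_hash("base", before) == task_hash("base", edited))  # False
```

In this model, relocating files leaves the hash untouched, while changing a file's content (and therefore its checksum) changes it.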
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
On Wed, Jun 19, 2013 at 04:45:55PM +0100, Paul Eggleton wrote:
> On Wednesday 19 June 2013 16:24:53 Paul Eggleton wrote:
> > Hi Jate,
> >
> > On Wednesday 19 June 2013 11:08:10 Jate Sujjavanich wrote:
> > > This allows sstate-cache to be shared between builds in different directories. Differences in the full path were triggering a false positive when there were actually no changes.
> > >
> > > Signed-off-by: Jate Sujjavanich <jate.sujjavan...@myfuelmaster.com>
> > > [...]
> >
> > Good catch! The only thing is, this will not help for files within different layers which may not be underneath TOPDIR; I think we'll need a function that determines which layer the file is under (longest path match from data.getVar('BBLAYERS', True).split()) and then take that path off the beginning.
> >
> > Additionally, this is a patch against bitbake so it will need to go to the bitbake-de...@lists.openembedded.org mailing list.
>
> Actually, looking more closely at this I'm not sure how the full path to the file would be getting into the signature - looking at lib/bb/siggen.py it should only be adding the file checksum value to the signature data and not the path. I did a quick test with master by moving some files referred to in SRC_URI to a different valid location (thus changing their full path), cleaning the recipe and then building it again, and the output was restored from sstate rather than rebuilding.
>
> Can you explain how you came to the conclusion that this was why the checksums were
Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path
On the same machine, I had two identical poky directories. One, I built first to use as the cache source. I then configured the second one to use a SSTATE_MIRRORS pointing to the first. The build of the second would just detect changes and force a rebuild.

Running the bitbake-diffsigs tool on the do_fetch...sigdata showed something like the following on all of the local files:

Removed(/home/user/poky1/meta/recipes...file1, chksum1)
Added(/home/user/poky2/meta/recipes...file1, chksum1)

Upon applying my patch, the components of core-image-minimal were pulled from the sstate-cache.

I will have to see about the other local file cases that I don't know about. :)

-Jate

From: Paul Eggleton [paul.eggle...@linux.intel.com]
Sent: Wednesday, June 19, 2013 11:45 AM
To: Jate Sujjavanich
Cc: openembedded-core@lists.openembedded.org
Subject: Re: [OE-core] [PATCH] Checksums for local files now stored using partial recipe path

On Wednesday 19 June 2013 16:24:53 Paul Eggleton wrote:
> Hi Jate,
>
> On Wednesday 19 June 2013 11:08:10 Jate Sujjavanich wrote:
> > This allows sstate-cache to be shared between builds in different directories. Differences in the full path were triggering a false positive when there were actually no changes.
> >
> > Signed-off-by: Jate Sujjavanich <jate.sujjavan...@myfuelmaster.com>
> > [...]
>
> Good catch! The only thing is, this will not help for files within different layers which may not be underneath TOPDIR; I think we'll need a function that determines which layer the file is under (longest path match from data.getVar('BBLAYERS', True).split()) and then take that path off the beginning