[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-528186487

Therefore, I suggest changing the input shape assertion of the ADD function so that it also passes when the two operands have the same number of dimensions, which covers the case where the input shapes of the two operands are the same: `assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) or (len(self.shape0) == len(self.shape1)) )`

If this amendment is okay, I will add this change to this PR.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
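The relaxed assertion proposed in the comment above can be tried out in isolation. The following is only a minimal sketch: `add_shapes_allowed` is a hypothetical helper name that does not exist in SINGA, and it takes plain Python lists in place of `self.shape0` and `self.shape1`. It evaluates the same boolean expression as the proposed `assert`.

```python
def add_shapes_allowed(shape0, shape1):
    """Hypothetical helper mirroring the relaxed assertion from the comment."""
    # Original condition: both operands have at most two dimensions.
    both_low_rank = len(shape0) <= 2 and len(shape1) <= 2
    # Proposed extra condition: both operands have the same number of dimensions.
    same_rank = len(shape0) == len(shape1)
    return both_low_rank or same_rank


# Residual add in a ResNet block: two 4-D feature maps.
print(add_shapes_allowed([8, 64, 32, 32], [8, 64, 32, 32]))  # True
# Adding a vector to a matrix, which the original check already allowed.
print(add_shapes_allowed([8, 10], [10]))  # True
# A 4-D tensor plus a 1-D tensor is still rejected.
print(add_shapes_allowed([8, 64, 32, 32], [32]))  # False
# Caveat: equal rank does not imply equal shape.
print(add_shapes_allowed([2, 3, 4], [5, 6, 7]))  # True
```

Note that the expression compares the number of dimensions, not the shapes themselves, so the last call passes even though the shapes differ. If the intent is to admit only identical shapes, comparing `shape0 == shape1` would be a stricter alternative.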
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
> (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the assertion condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        # up till now, the dimensions of tensor a and b should less than 3
        self.shape0 = list(a.shape())
        self.shape1 = list(b.shape())
        assert (len(self.shape0) <= 2 and len(self.shape1) <= 2), \
            "up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if type(dy) == float:
            assert self.shape0 == self.shape1, ('should have same shape')
            return dy, dy
        db = CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0) - len(self.shape1)):
            db = singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.
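To see what relaxing the assertion means for the `backward` method quoted above, the reduction loop can be imitated with nested Python lists. This is a rough sketch under stated assumptions: `sum_axis0` and `grad_wrt_b` are hypothetical names, and `sum_axis0` assumes that `singa.Sum(db, 0)` sums a tensor over its first axis.

```python
def sum_axis0(t):
    """Sum a nested list over its first axis (assumed stand-in for singa.Sum(t, 0))."""
    if not isinstance(t[0], list):
        return sum(t)  # a 1-D input collapses to a scalar
    # zip(*t) groups the entries that differ only in their axis-0 index.
    return [sum_axis0(list(group)) for group in zip(*t)]


def grad_wrt_b(dy, shape0, shape1):
    """Mimic the loop in Add.backward: reduce one leading axis per missing dimension."""
    db = dy
    for _ in range(len(shape0) - len(shape1)):
        db = sum_axis0(db)
    return db


dy = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # upstream gradient of shape (2, 3)

# b of shape (3,) was broadcast over axis 0, so its gradient is summed over that axis.
print(grad_wrt_b(dy, [2, 3], [3]))  # [2.0, 2.0, 2.0]

# Same number of dimensions: the loop runs zero times and dy is passed through.
print(grad_wrt_b(dy, [2, 3], [2, 3]))  # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

With identically shaped operands, as in the ResNet residual add, the loop is a no-op and both gradients equal `dy`, which is consistent with resnet.py running once the assertion is disabled. If two operands had the same number of dimensions but different shapes, for example (2, 3) and (1, 3), this loop would not reduce anything, so that case may need separate handling.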
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025 > I have combined all the commits into two commits. Meanwhile, I found that the resnet.py is not compatible with the master branch modified "Add" function with broadcasting. Get the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" > AssertionError: up till now, the dimensions of tensor a and b should less than 3) > Since in resnet we used "out = autograd.add(out, residual)", the input to the add function should have a dimension more than 2, the assert function should return always false and hence assertion error When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, the resnet.py can run successfully See the code of Add function ```python class Add(Operation): def __init__(self): super(Add, self).__init__() def forward(self, a, b): #up till now, the dimensions of tensor a and b should less than 3 self.shape0=list(a.shape()) self.shape1=list(b.shape()) assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" return singa.__add__(a, b) def backward(self, dy): if(type(dy)==float): assert self.shape0==self.shape1,('should have same shape') return dy,dy db=CTensor(list(dy.shape()), dy.device()) db.CopyData(dy) for i in range(len(self.shape0)-len(self.shape1)): db=singa.Sum(db, 0) return dy, db ``` Can we allow input dimension to ADD more than two? (e.g. change the limit 2 to 4 or disable the assertion). Typically there are four dimensions for conv. feature maps: batch, depth/channel, width, height Or if the input shapes of the two operands are two same, it also passes: ```python assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) and (len(self.shape0)-len(self.shape1)!=0) )``` This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025 > I have combined all the commits into two commits. Meanwhile, I found that the resnet.py is not compatible with the master branch modified "Add" function with broadcasting. Get the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" > AssertionError: up till now, the dimensions of tensor a and b should less than 3) > Since in resnet we used "out = autograd.add(out, residual)", the input to the add function should have a dimension more than 2, the assert function should return always false and hence assertion error When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, the resnet.py can run successfully See the code of Add function ```python class Add(Operation): def __init__(self): super(Add, self).__init__() def forward(self, a, b): #up till now, the dimensions of tensor a and b should less than 3 self.shape0=list(a.shape()) self.shape1=list(b.shape()) assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" return singa.__add__(a, b) def backward(self, dy): if(type(dy)==float): assert self.shape0==self.shape1,('should have same shape') return dy,dy db=CTensor(list(dy.shape()), dy.device()) db.CopyData(dy) for i in range(len(self.shape0)-len(self.shape1)): db=singa.Sum(db, 0) return dy, db ``` Can we allow input dimension to ADD more than two? (e.g. change the limit 2 to 4 or disable the assertion). Typically there are four dimensions for conv. feature maps: batch, depth/channel, width, height Or if the input shapes of the two operands are two same, it also passes: ```python assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) and (len(self.shape0)-len(self.shape1)!=0) ) This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025 > I have combined all the commits into two commits. Meanwhile, I found that the resnet.py is not compatible with the master branch modified "Add" function with broadcasting. Get the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" > AssertionError: up till now, the dimensions of tensor a and b should less than 3) > Since in resnet we used "out = autograd.add(out, residual)", the input to the add function should have a dimension more than 2, the assert function should return always false and hence assertion error When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, the resnet.py can run successfully See the code of Add function ```python class Add(Operation): def __init__(self): super(Add, self).__init__() def forward(self, a, b): #up till now, the dimensions of tensor a and b should less than 3 self.shape0=list(a.shape()) self.shape1=list(b.shape()) assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3" return singa.__add__(a, b) def backward(self, dy): if(type(dy)==float): assert self.shape0==self.shape1,('should have same shape') return dy,dy db=CTensor(list(dy.shape()), dy.device()) db.CopyData(dy) for i in range(len(self.shape0)-len(self.shape1)): db=singa.Sum(db, 0) return dy, db ``` Can we allow input dimension to ADD more than two? (e.g. change the limit 2 to 4 or disable the assertion). Typically there are four dimensions for conv. feature maps: batch, depth/channel, width, height Or if the input shapes of the two operands are two same, it also passes: ```python assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) and (len(self.shape0)-len(self.shape1)!=0) )``` This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.

Alternatively, the assertion could also pass when the two operands have the same number of dimensions:
`assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) or (len(self.shape0) == len(self.shape1)) )`
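To make the proposed relaxation concrete, here is a minimal standalone sketch of the rank check on plain Python lists. The helper name `add_inputs_allowed` and the example shapes are hypothetical and not part of SINGA; the function only mirrors the intent stated above (both operands have at most two dimensions, or both have the same number of dimensions).

```python
def add_inputs_allowed(shape0, shape1):
    """Hypothetical helper: relaxed rank check for the two operands of Add."""
    both_low_rank = len(shape0) <= 2 and len(shape1) <= 2
    same_rank = len(shape0) == len(shape1)
    return both_low_rank or same_rank


# Two 4-D feature maps (batch, channel, width, height), as in a residual add.
print(add_inputs_allowed([8, 64, 32, 32], [8, 64, 32, 32]))  # True
# A matrix plus a bias vector still passes, as it did before.
print(add_inputs_allowed([8, 10], [10]))  # True
# A 4-D tensor plus a vector is still rejected.
print(add_inputs_allowed([8, 64, 32, 32], [64]))  # False
```

Note that this compares only the number of dimensions, not the sizes along each axis, so it is a looser check than requiring identical shapes.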
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.

Alternatively, the assertion could also pass when the two operands have the same number of dimensions:
`assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) or (len(self.shape0) == len(self.shape1)) )`
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.

Alternatively, the assertion could also pass when the two operands have the same number of dimensions:
`assert( (len(self.shape0) <= 2 and len(self.shape1) <= 2) or (len(self.shape0) == len(self.shape1)) )`
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.
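The loop in `backward` quoted above sums the incoming gradient over axis 0 once for every extra leading dimension that `a` has relative to `b`. The following is a rough pure-Python sketch of that reduction on nested lists. The helpers `sum_axis0` and `add_backward` are hypothetical stand-ins (assuming `singa.Sum(x, 0)` sums along the first axis), and the sketch assumes `b` was broadcast along leading axes only.

```python
def sum_axis0(x):
    """Sum a nested list along its first axis (stand-in for singa.Sum(x, 0))."""
    first = x[0]
    if isinstance(first, list):
        # Gather the i-th sub-list of every row, then reduce that group.
        return [sum_axis0([row[i] for row in x]) for i in range(len(first))]
    return sum(x)


def add_backward(dy, shape0, shape1):
    """Return (da, db) for a + b, where b was broadcast along leading axes."""
    db = dy
    for _ in range(len(shape0) - len(shape1)):
        db = sum_axis0(db)
    return dy, db


# a has shape [2, 3] and b has shape [3], so dy has shape [2, 3].
dy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
da, db = add_backward(dy, [2, 3], [3])
print(db)  # [5.0, 7.0, 9.0]
```

When both operands have the same number of dimensions, as in the residual add, the loop body never runs and `db` is just `dy`.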
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting (#524). It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (resnet feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
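The failure described above can be reproduced without SINGA by applying the same condition to plain shape lists. This is only an illustrative sketch: `original_check` and `fails` are hypothetical helpers, and the shape `[8, 64, 32, 32]` is a made-up example of a (batch, channel, width, height) feature map.

```python
def original_check(shape0, shape1):
    """Same condition and message as the assertion in Add.forward."""
    assert (len(shape0) <= 2 and len(shape1) <= 2), \
        "up till now, the dimensions of tensor a and b should less than 3"


def fails(shape0, shape1):
    """Return True if the original assertion rejects these operand shapes."""
    try:
        original_check(shape0, shape1)
    except AssertionError:
        return True
    return False


feature_map = [8, 64, 32, 32]  # hypothetical 4-D residual-block tensor shape
print(fails(feature_map, feature_map))  # True: 4-D operands are rejected
print(fails([8, 10], [10]))             # False: 2-D plus 1-D is accepted
```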
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        #assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Conv. feature maps typically have four dimensions: batch, depth/channel, width, height.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting (https://github.com/apache/incubator-singa/pull/524 or #520). It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting (#524). It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting (https://github.com/apache/incubator-singa/pull/520). It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting (https://github.com/apache/incubator-singa/pull/522). It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        #assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)? Feature maps typically have four dimensions: batch, depth/channel, width, height.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (feature maps typically have four: batch, depth/channel, width, height), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        #assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs to ADD with more than two dimensions (e.g. change the limit from 2 to 4, or disable the assertion)?
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        #assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs with more than two dimensions (e.g. change 2 to 4, or disable the assertion)?
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527824025

> I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error (assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
> AssertionError: up till now, the dimensions of tensor a and b should less than 3)
> Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.

When I disable the assertion `assert(len(self.shape0) <= 2 and len(self.shape1) <= 2)`, resnet.py runs successfully.

See the code of the Add function:

```python
class Add(Operation):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, a, b):
        #up till now, the dimensions of tensor a and b should less than 3
        self.shape0=list(a.shape())
        self.shape1=list(b.shape())
        #assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
        return singa.__add__(a, b)

    def backward(self, dy):
        if(type(dy)==float):
            assert self.shape0==self.shape1,('should have same shape')
            return dy,dy
        db=CTensor(list(dy.shape()), dy.device())
        db.CopyData(dy)
        for i in range(len(self.shape0)-len(self.shape1)):
            db=singa.Sum(db, 0)
        return dy, db
```

Can we allow inputs with more than two dimensions?
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions (the first dimension is the batch), so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false and an AssertionError is raised.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses "out = autograd.add(out, residual)", the inputs to the add function have more than 2 dimensions, so the asserted condition is always false.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-527812086

I have combined all the commits into two commits. Meanwhile, I found that resnet.py is not compatible with the "Add" function on the master branch, which was modified to support broadcasting. It fails with the error
(assert(len(self.shape0) <= 2 and len(self.shape1) <= 2),"up till now, the dimensions of tensor a and b should less than 3"
AssertionError: up till now, the dimensions of tensor a and b should less than 3)

Since resnet uses out = autograd.add(out, residual), the inputs to the add function have more than 2 dimensions, so the asserted condition is always false.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-523749828

Merged from the latest master again; sonnx.py is now the same as on master. Also retested on AWS.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521635425

Updated on 15 Aug 2019

Latest successful build log, taken just now after commit 3076113, which adds the license header. It also successfully builds and runs the Jupyter notebook example.

```
ubuntu@ip-172-31-17-155:~/incubator-singa$ mkdir build
ubuntu@ip-172-31-17-155:~/incubator-singa$ cd build
ubuntu@ip-172-31-17-155:~/incubator-singa/build$ cmake -D CMAKE_PREFIX_PATH="/usr/local/cuda/lib64;/usr/local/cuda/" -DENABLE_TEST=OFF -DUSE_CUDA=ON -DUSE_PYTHON3=ON -DUSE_MKLDNN=ON -DUSE_MODULES=OFF -DUSE_DIST=ON ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found Protobuf: /usr/local/lib/libprotobuf.so;-lpthread (found suitable version "3.0.0", minimum required is "3.0")
-- Found CBLAS: /usr/local/include
-- Found GLOG: /usr/include
-- Found cuda_v10.0
-- Found CUDNN: /usr/local/cuda/include
-- Found Cudnn_7401 at /usr/local/cuda/include /usr/local/cuda/lib64/libcudnn.so
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.5.2", minimum required is "3")
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.5m.so (found suitable version "3.5.2", minimum required is "3")
-- Found SWIG: /usr/local/bin/swig (found suitable version "3.0.12", minimum required is "3.0.10")
-- Found MKLDNN at /usr/local/include
-- Found MPI at /home/ubuntu/mpich-3.3/build/include
-- Found MPI lib at /home/ubuntu/mpich-3.3/build/lib/libmpi.so
-- Found all lib at /usr/local/lib/libprotobuf.so;/usr/local/lib/libopenblas.so;/usr/lib/x86_64-linux-gnu/libglog.so;/usr/local/cuda/lib64/libcudnn.so;/usr/local/cuda/lib64/libcudart.so;/usr/local/cuda/lib64/libcurand.so;/usr/local/cuda/lib64/libcublas.so;/home/ubuntu/incubator-singa/build/lib/libcnmem.a;/usr/local/lib/libmkldnn.so;/home/ubuntu/mpich-3.3/build/lib/libmpi.so;/home/ubuntu/mpich-3.3/build/lib/libmpicxx.so
-- Found NCCL at /usr/local/cuda/include
-- Found NCCL lib at /usr/local/cuda/lib/libnccl.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ubuntu/incubator-singa/build
ubuntu@ip-172-31-17-155:~/incubator-singa/build$ make -j2
Scanning dependencies of target cnmem
Scanning dependencies of target copy_protobuf
[ 1%] Creating directories for 'cnmem'
[ 2%] Running C++ protocol buffer compiler on /home/ubuntu/incubator-singa/src/proto/model.proto
[libprotobuf WARNING google/protobuf/compiler/parser.cc:547] No syntax specified for the proto file: model.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)
[ 3%] Performing download step (git clone) for 'cnmem'
Cloning into 'cnmem'...
[ 4%] Running C++ protocol buffer compiler on /home/ubuntu/incubator-singa/src/proto/caffe.proto
[ 5%] Running C++ protocol buffer compiler on /home/ubuntu/incubator-singa/src/proto/core.proto
[libprotobuf WARNING google/protobuf/compiler/parser.cc:547] No syntax specified for the proto file: core.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)
[ 6%] Running C++ protocol buffer compiler on /home/ubuntu/incubator-singa/src/proto/io.proto
[libprotobuf WARNING google/protobuf/compiler/parser.cc:547] No syntax specified for the proto file: io.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)
[ 7%] Copying Protobuf headers
[ 7%] Built target copy_protobuf
[ 8%] Building NVCC (Device) object src/CMakeFiles/cuda_compile_1.dir/core/tensor/cuda_compile_1_generated_math_kernel.cu.o
Already on 'master'
Your branch is up-to-date with 'origin/master'.
[ 9%] No patch step for 'cnmem'
[ 10%] Performing update step for 'cnmem'
Current branch master is up to date.
[ 11%] Performing configure step for 'cnmem'
--
```
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521258077

Already resolved by restoring meta.yaml to the master branch version, because the following entries had already been added in the master branch:
- libprotobuf 3.6.1
- libopenblas 0.3.3

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521258077

I am also not sure about the change in the following conda meta.yaml file:

![change_in_meta_yaml](https://user-images.githubusercontent.com/38325429/63027003-bcbc8780-bede-11e9-9c77-b60f8aa6bf48.png)

For example, I don't know why git_url was commented out or what effect that has in conda-build (perhaps this way the source code is taken from the local disk instead of from the git repo).

After comparing the code with the master branch, I have already restored the file to the master branch's "meta.yaml", because the following entries are already there in the master branch:
- libprotobuf 3.6.1
- libopenblas 0.3.3
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521270716

I have created an additional image in AWS, "Jupyter Notebook Demo (Distributed Module) version 2", based on the current status (i.e. dist_new updated by merging a copy of the latest master branch).
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module
URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521247206

> Just now I did a test: when I replace the merged tensor.cc file with the latest master branch tensor.cc, the loss can be reduced again. So I just changed the tensor.cc of dist_new to the latest master branch one. This would omit the commit f54a526 made on tensor.cc.
>
> I guess this may be because:
> The commit f54a526 in dist_new redesigned the softmax function to call cudnn, but this is not compatible with the latest master branch.

OK, I checked again in the AWS image: for the previous two months, the code we used for the distributed module has had the changes made by commit f54a526 to the softmax function commented out, so softmax is the same as in the master branch. This is the softmax code we used in the AWS image of the distributed module:

```cpp
void SoftMax(const Tensor &in, Tensor *out) {
  CHECK_LE(in.nDim(), 2u);
  out->CopyData(in);
  size_t nrow = 1, ncol = in.Size(), size = ncol;
  if (in.nDim() == 2u) {
    nrow = in.shape(0);
    ncol = size / nrow;
    out->Reshape(Shape{nrow, ncol});
  }
  // Subtract each row's maximum before exponentiating, then normalize by the row sum.
  Tensor tmp = RowMax(*out);
  SubColumn(tmp, out);
  Exp(*out, out);
  SumColumns(*out, &tmp);
  DivColumn(tmp, out);
  out->Reshape(in.shape());
}

// Version from commit f54a526, commented out:
// void SoftMax(const Tensor &in, Tensor *out) {
//   CHECK_LE(in.nDim(), 2u);
//   TYPE_LANG_SWITCH(in.data_type(), DType, in.device()->lang(), Lang, {
//     out->device()->Exec([in, out](Context * ctx) {
//       SoftMax(in, out, ctx);
//     }, {in.block(), out->block()}, {out->block()});
//   });
// }
```
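The row-wise computation in the active SoftMax above (subtract each row's maximum, exponentiate, then divide by the row sum) can be sketched in plain Python. This is only an illustrative sketch, assuming the input is a list of rows of floats; the name `softmax_rows` is hypothetical and is not part of SINGA.

```python
import math


def softmax_rows(rows):
    """Row-wise softmax over a list of rows (hypothetical helper, not SINGA API)."""
    result = []
    for row in rows:
        row_max = max(row)                            # analogous to RowMax
        exps = [math.exp(x - row_max) for x in row]   # SubColumn, then Exp
        total = sum(exps)                             # analogous to SumColumns
        result.append([e / total for e in exps])      # analogous to DivColumn
    return result


if __name__ == "__main__":
    # Logits this large would overflow math.exp without the max subtraction.
    print(softmax_rows([[1000.0, 1000.0], [0.0, math.log(3.0)]]))
```

Subtracting the row maximum does not change the result mathematically, but it keeps every exponent at or below zero, which is why large inputs do not overflow.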
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-521235062

> Just now I did a test: when I replace the merged tensor.cc file with the tensor.cc from the latest master branch, the loss decreases again. So I changed the tensor.cc of dist_new to the latest master branch version. This drops the changes that commit f54a526 made to tensor.cc.

I guess this may be because commit f54a526 in dist_new redesigned the softmax function to call cuDNN, but this is not compatible with the latest master branch.
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-503007595 I have run the ResNet-50 example test under examples/autograd/resnet_dist.py using 4 AWS p2.8xlarge instances (32 K80 GPUs in total). The program finished with no errors returned. The dependencies used: NCCL 2.4.7 and MPICH 3.3. The command of the test was: //mpiexec --hostfile python3 ./incubator-singa/examples/autograd/resnet_dist.py The GPU utilization for node 4 was: ![gpu utilization](https://user-images.githubusercontent.com/38325429/59667326-1c512c00-91e9-11e9-9925-7803263d8b17.png)
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-503007595 I have run the ResNet-50 example test in examples/autograd/resnet_dist.py using 4 AWS p2.8xlarge instances (32 K80 GPUs in total). The program finished with no errors returned. The dependencies used: NCCL 2.4.7 and MPICH 3.3. The command of the test was: //mpiexec --hostfile python3 ./incubator-singa/examples/autograd/resnet_dist.py The GPU utilization for node 4 was: ![gpu utilization](https://user-images.githubusercontent.com/38325429/59667326-1c512c00-91e9-11e9-9925-7803263d8b17.png)
[GitHub] [incubator-singa] chrishkchris edited a comment on issue #468: Distributted module
chrishkchris edited a comment on issue #468: Distributted module URL: https://github.com/apache/incubator-singa/pull/468#issuecomment-503007595 I have run the ResNet-50 example test in examples/autograd/resnet_dist.py using 4 AWS p2.8xlarge instances (32 K80 GPUs in total). The program finished with no errors returned. ![gpu utilization](https://user-images.githubusercontent.com/38325429/59667326-1c512c00-91e9-11e9-9925-7803263d8b17.png) The dependencies used: NCCL 2.4.7 and MPICH 3.3. The command of the test was: //mpiexec --hostfile python3 ./incubator-singa/examples/autograd/resnet_dist.py